Watchdog thread synchronisation with semaphores?
david.sanderson@bem.fki-et.com

2006-07-18, 4:00 am

Hi All,
I have several monitored processes, which report 'ok' down a socket to a
guardian.
If any process fails to report, then we stop kicking the hardware
watchdog and the system warm resets.

My design is as follows:

1 'comms' thread for each process, which listens to its socket, blocked on
read.
When read returns, it checks that the correct thing was sent, and then
gives a binary semaphore.
It then loops back to wait on read.

A 'watchdog kicking' thread, which blocks on all semaphores that are
given by the comms threads.
It takes the semaphores as they are given, but never gives them.
When all semaphores have been given, it kicks the hardware watchdog.
It then loops back to blocking on the semaphores.

I think this will achieve my aims, but am I going to run into problems
with it?
Is there a better way (in POSIX) to synchronise threads?

Thanks

Dave

Hubble

2006-07-18, 4:00 am


david.sander...@bem.fki-et.com wrote:

> Hi All,
> I have several monitored processes, who report 'ok' down a socket to a
> guardian.
> If any process fails to report then we stop kicking the hardware
> watchdog and warm reset.
>
> my design is as follows:
>
> 1 'comms' thread for each process, which listens to socket, blocked on
> read.
> When read returns checks that the correct thing was sent, and then
> gives a binary semaphore.
> loops back to wait on read.
>
> a 'watchdog kicking' thread, which blocks on all semaphores that are
> given by the comms threads.
> it takes the semaphores as they become given, but never gives them.
> When all semaphores have been given kicks the hardware watchdog.
> loops back to blocking on semaphores.
>
> I think this will achieve my aims, but am I going to run into problems
> with it?
> Is there a better way (in POSIX) to synchronise threads?
>
> thanks
>
> Dave


Using a thread that does almost nothing for each monitored process seems
like overkill to me. Also, you do not see or log which daemon
causes the trouble, which makes debugging difficult.

Also, PThreads (POSIX threads) does not implement semaphores. It uses
mutexes, joins and condition variables for synchronisation. In your
case, a condition variable (which must be used together with a mutex)
should be used.

Hubble.

Maxim Yegorushkin

2006-07-18, 4:00 am


david.sanderson@bem.fki-et.com wrote:
> Hi All,
> I have several monitored processes, who report 'ok' down a socket to a
> guardian.
> If any process fails to report then we stop kicking the hardware
> watchdog and warm reset.


I could not understand that part.

> my design is as follows:
>
> 1 'comms' thread for each process, which listens to socket, blocked on
> read.
> When read returns checks that the correct thing was sent, and then
> gives a binary semaphore. loops back to wait on read.


Did you mean it increments the semaphore?

> a 'watchdog kicking' thread, which blocks on all semaphores that are
> given by the comms threads.
> it takes the semaphores as they become given, but never gives them.
> When all semaphores have been given kicks the hardware watchdog.
> loops back to blocking on semaphores.


Are you saying that the thread does something with the watchdog only
once it has acquired all the semaphores?

> I think this will achieve my aims, but am I going to run into problems
> with it?
> Is there a better way (in POSIX) to synchronise threads?


I don't think you need more than one thread here with a simple select()
loop.
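
A minimal sketch of such a single-threaded select() loop, assuming every monitored process has already connected and its descriptor is held in a vector. The helper name AllReportedInTime and the timeout handling are illustrative assumptions, not something taken from the thread; it only checks that each descriptor delivered some data, not what the data was.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <ctime>
#include <vector>

// Hypothetical helper: returns true if every descriptor in `fds` delivered
// at least one byte within roughly `timeoutSeconds`; returns false on
// timeout, on end-of-file, or on a read/select error.
bool AllReportedInTime(const std::vector<int>& fds, int timeoutSeconds)
{
    std::vector<bool> seen(fds.size(), false);
    std::size_t remaining = fds.size();
    const time_t deadline = time(NULL) + timeoutSeconds;

    while (remaining > 0) {
        // Build the set of descriptors that have not reported yet.
        fd_set readSet;
        FD_ZERO(&readSet);
        int maxFd = -1;
        for (std::size_t i = 0; i < fds.size(); ++i) {
            if (seen[i]) continue;
            assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
            FD_SET(fds[i], &readSet);
            if (fds[i] > maxFd) maxFd = fds[i];
        }

        // Wait only for the time left until the deadline (1 s resolution).
        const time_t now = time(NULL);
        if (now >= deadline) return false;
        struct timeval tv;
        tv.tv_sec = deadline - now;
        tv.tv_usec = 0;

        const int ready = select(maxFd + 1, &readSet, NULL, NULL, &tv);
        if (ready < 0) {
            if (errno == EINTR) continue;   // interrupted by a signal, retry
            return false;
        }
        if (ready == 0) return false;       // somebody failed to report in time

        for (std::size_t i = 0; i < fds.size(); ++i) {
            if (seen[i] || !FD_ISSET(fds[i], &readSet)) continue;
            char buf[64];
            const ssize_t n = read(fds[i], buf, sizeof buf);
            if (n <= 0) return false;       // peer closed the socket, or error
            seen[i] = true;                 // this is also the place to log it
            --remaining;
        }
    }
    return true;
}
```

The guardian would call this in a loop and kick the watchdog only when it returns true. Because all the bookkeeping lives in one thread, no semaphores or mutexes are needed.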

david.sanderson@bem.fki-et.com

2006-07-18, 4:00 am

Hubble wrote:
> david.sander...@bem.fki-et.com wrote:
>
>
> using a thread doing almost nothing for each monitored process seems to
> me a bit like an overkill.


On reflection it does seem a bit over the top, but I thought the correct
way to handle multiple accepts on a socket was to pass the listen off to
a separate thread?
Currently there will be 6 monitored processes, but this might grow; we
are still in the early stages of this project.

> Also you do not see or log which daemon
> causes the trouble, making debugging difficult.
>


When the thread gives the semaphore it can write to a log saying that
process xxx is ok.
Of course, with a single thread reading from multiple sockets, the same
can be done as each one reports ok.

> Also, PThreads (POSIX threads) does not implement semaphores. It uses
> Mutexes, joins and condition variables for synchronisation. In your
> case, a condition variable (which must be surrounded by a mutex) should
> be used.
>

Are you saying I can't use a semaphore in a thread, or that I shouldn't?


Dave

Maxim Yegorushkin

2006-07-18, 4:00 am

david.sanderson@bem.fki-et.com wrote:
> Hubble wrote:
>
> On reflection it does seem a bit over the top, but I thought the correct
> way to handle multiple accepts on a socket was to pass the listen off to
> a separate thread?
> Currently there will be 6 monitored processes, but this might grow; we
> are still in the early stages of this project.


A select loop would suffice.

[]

> Are you saying I can't use a semaphore in a thread, or that I shouldn't?


You can use semaphores, they are a part of POSIX.
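
A minimal sketch of the design from the original post using POSIX unnamed semaphores (sem_init, sem_post, sem_timedwait). The function names, the fixed process count and the timeout are assumptions made for illustration only.

```cpp
#include <semaphore.h>
#include <time.h>
#include <cassert>
#include <cerrno>
#include <cstddef>

const std::size_t kProcessCount = 3;    // assumed number of monitored processes
static sem_t reportSem[kProcessCount];  // one semaphore per monitored process

// Call once before any thread touches the semaphores.
bool InitReportSemaphores()
{
    for (std::size_t i = 0; i < kProcessCount; ++i) {
        if (sem_init(&reportSem[i], 0, 0) != 0) return false;
    }
    return true;
}

// Called by a comms thread after process `index` has sent a valid report.
void GiveReport(std::size_t index)
{
    assert(index < kProcessCount);
    sem_post(&reportSem[index]);
}

// Called by the watchdog thread. Returns true if every semaphore could be
// taken before the shared deadline, false as soon as one wait fails.
bool TakeAllReports(int timeoutSeconds)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeoutSeconds;

    for (std::size_t i = 0; i < kProcessCount; ++i) {
        int rc;
        do {
            rc = sem_timedwait(&reportSem[i], &deadline);
        } while (rc == -1 && errno == EINTR);   // retry if a signal interrupted us
        if (rc != 0) return false;              // typically errno == ETIMEDOUT
    }
    return true;
}
```

Note that POSIX semaphores count rather than being strictly binary, so a process that reports twice in one round leaves a surplus that satisfies the next round as well. Draining with sem_trywait before each round is one way to avoid that.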

If you are waiting until several binary semaphores have all been
released, you can simplify this by using a counter, a mutex and a
condition variable instead of several semaphores. Note that if you use
one thread with select(), you don't need any semaphores or mutexes.
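
A minimal sketch of the counter, mutex and condition variable approach, assuming a fixed number of monitored processes. The names MarkReported and WaitUntilAllReported are hypothetical, as is the per-process flag used to ignore duplicate reports within a round.

```cpp
#include <pthread.h>
#include <time.h>
#include <cassert>
#include <cstddef>

const std::size_t kProcessCount = 3;    // assumed number of monitored processes

static pthread_mutex_t reportMutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  allReported = PTHREAD_COND_INITIALIZER;
static bool hasReported[kProcessCount]; // zero-initialised: nobody has reported
static std::size_t reportCount = 0;     // processes that reported this round

// Called by a comms thread after process `index` has sent a valid report.
void MarkReported(std::size_t index)
{
    assert(index < kProcessCount);
    pthread_mutex_lock(&reportMutex);
    if (!hasReported[index]) {          // count each process once per round
        hasReported[index] = true;
        if (++reportCount == kProcessCount) {
            pthread_cond_signal(&allReported);
        }
    }
    pthread_mutex_unlock(&reportMutex);
}

// Called by the watchdog thread. Returns true if all processes reported
// before the timeout, then resets the state for the next round.
bool WaitUntilAllReported(int timeoutSeconds)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeoutSeconds;

    pthread_mutex_lock(&reportMutex);
    int rc = 0;
    // Loop to cope with spurious wakeups; stop on timeout or error.
    while (reportCount < kProcessCount && rc == 0) {
        rc = pthread_cond_timedwait(&allReported, &reportMutex, &deadline);
    }
    const bool ok = (reportCount == kProcessCount);
    reportCount = 0;
    for (std::size_t i = 0; i < kProcessCount; ++i) hasReported[i] = false;
    pthread_mutex_unlock(&reportMutex);
    return ok;
}
```

The watchdog thread would kick the hardware only when WaitUntilAllReported returns true. The flags also record which process stayed silent, which could be logged before the reset.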

Hubble

2006-07-18, 7:59 am


david.sanderson@bem.fki-et.com wrote:

> Are you saying I can't use a semaphore in a thread, or that I shouldn't?


My old documentation of PThreads did not mention semaphores. I think
they weren't even present in that version. So I confused them with SysV
semaphores, which I would never use.

Hubble.

david.sanderson@bem.fki-et.com

2006-07-21, 3:59 am

<Snipped bit describing how I want to monitor several processes and
kick a hw watchdog only when they all report in as ok>

Not sure if I mentioned this before, but this is an embedded XScale CPU
running QNX, and I would like to be as POSIX portable as possible.

Having read the help docs for my system on select I was a little (well,
OK, quite a lot) confused, so I wrote something in a way I understand.

Critique welcome, I'm still learning...


********************* Header file************************************


#ifndef PIG_SOCKET_H
#define PIG_SOCKET_H

const unsigned short int PIGPort = 1600;
const int SQUEAL_MSG_SIZE = 12; //PROCESS_OK\n

class PMS_PIG_Socket_Report
{
public:
    PMS_PIG_Socket_Report(int PauseBetweenHealthReports);
    ~PMS_PIG_Socket_Report(void);
    void Run(void);

private:
    void Connect(void);
    void Squeal(void);

    int PauseBetweenSqueals;
    int socketDescriptor; // handle for sending down
};


// Node in the circular list of connected sockets.
struct SocketList
{
    SocketList* NextSocket;
    int Socket;
};

class PMS_PIG_Socket_Receive
{
public:
    PMS_PIG_Socket_Receive(void);
    ~PMS_PIG_Socket_Receive(void);
    void Listen(void);

private:
    void SpawnCommsThread(SocketList* pSockListHead);
    int listenSocket, connectSocket;
};
#endif //PIG_SOCKET_H


******************** Source File*************************************
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <arpa/inet.h>

#include <sys/types.h>
#include <sys/socket.h>

#include <netdb.h>
#include <netinet/in.h>
#include <iostream>
#include <string.h>
#include <pthread.h>

#include "PIG_Socket.h"

PMS_PIG_Socket_Report::PMS_PIG_Socket_Report(int PauseBetweenHealthReports):
    PauseBetweenSqueals(PauseBetweenHealthReports)
{
    // Assumes we want to connect as soon as we are constructed,
    // as this class doesn't do anything else...
    Connect();
}

PMS_PIG_Socket_Report::~PMS_PIG_Socket_Report(void)
{
    // hope never to get here, so we won't do anything if we do...
}

void PMS_PIG_Socket_Report::Run(void)
{
    while(1)
    {
        // usleep might be better, depends on the watchdog timeout?
        sleep(PauseBetweenSqueals);
        Squeal();
    }
}

void PMS_PIG_Socket_Report::Connect(void)
{
    struct sockaddr_in serverAddress;
    struct hostent *hostInfo;

    // We are currently only going to monitor processes on THIS processor.
    hostInfo = gethostbyname("localhost");
    if (hostInfo == NULL) {
        printf("problem interpreting host: localhost\n");
        exit(1); // perhaps a bit harsh, might need fixing later
    }

    // Create a socket, we want reliable (TCP) transport
    socketDescriptor = socket(AF_INET, SOCK_STREAM, 0);
    if (socketDescriptor < 0)
    {
        printf("cannot create socket\n");
        exit(1); // harsh...
    }

    // Connect to the PIG monitoring.
    memset(&serverAddress, 0, sizeof(serverAddress)); // clear unused fields
    serverAddress.sin_family = hostInfo->h_addrtype;
    memcpy((char *) &serverAddress.sin_addr.s_addr,
           hostInfo->h_addr_list[0], hostInfo->h_length);
    serverAddress.sin_port = htons(PIGPort);

    if (connect(socketDescriptor, (struct sockaddr*) &serverAddress,
                sizeof(serverAddress)) < 0)
    {
        printf("cannot connect\n");
        exit(1); // harsh
    }
}

void PMS_PIG_Socket_Report::Squeal(void)
{
    // Send "Process OK" down the socket. The +1 sends the terminating
    // NUL as well, which makes SQUEAL_MSG_SIZE bytes in total.
    if (send(socketDescriptor, "Process OK\n", strlen("Process OK\n") + 1, 0) < 0)
    {
        printf("cannot send data ");
        close(socketDescriptor);
        exit(1);
    }
}

/********************************************************************/




// Keep a count so we know if we've been through them all.
// Note: written by Listen() and read by the comms thread without any locking.
static int MonitoredProcessCount;


// The thread that listens for the squeals and then kicks the wdog.
void* ThreadSockRcvFunc( void* arg )
{
    char line[SQUEAL_MSG_SIZE];
    SocketList* pCurrentSocket = (SocketList*)arg;
    int Count = 0;
    bool bAllOk = true; // cleared for good once any process fails to report
    while(1) // run forever
    {
        while(Count < MonitoredProcessCount)
        {
            if(recv(pCurrentSocket->Socket, line, SQUEAL_MSG_SIZE, 0) > 0)
            {
                printf("RX'd from process %d\n", Count);
            }
            else
            {
                // recv returns 0 if the peer closed the connection and -1
                // on error. Don't expect to get here, but diagnostic if we do.
                printf("recv returned zero or less for process %d\n", Count);
                bAllOk = false;
            }
            // Advance in both cases so Count and the list position stay in
            // step. As the list is circular this will be the beginning when
            // we get to the end...
            pCurrentSocket = pCurrentSocket->NextSocket;
            Count++; // next one
        }
        if(bAllOk)
        {
            printf("Processed all requests, kick the wdog and return to the top of the list\n");
            // TODO actually KICK the WATCHDOG
        }
        Count = 0;
        sleep(1); // probably not required? or maybe too long
    }
    return NULL; // never reached
}


PMS_PIG_Socket_Receive::PMS_PIG_Socket_Receive(void)
{
    // don't think we have anything to do here?
    // we could call listen, but it does not return,
    // which is not a nice thing to do...
}

PMS_PIG_Socket_Receive::~PMS_PIG_Socket_Receive(void)
{
    // never expect to be destructed in the normal run of things.
}

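// This could be in the Listen code?
void PMS_PIG_Socket_Receive::SpawnCommsThread(SocketList* pSockListHead)
{
    // POSIX does not guarantee that a NULL thread pointer is accepted,
    // so pass a real pthread_t even though the id is never used.
    pthread_t thread;
    pthread_attr_t attr;
    pthread_attr_init( &attr );
    pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_DETACHED );
    pthread_create( &thread, &attr, &ThreadSockRcvFunc, pSockListHead );
    pthread_attr_destroy( &attr );
}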

void PMS_PIG_Socket_Receive::Listen(void)
{
    // Accept the incoming connection and palm it off to a new socket
    // with a thread servicing it.
    socklen_t clientAddressLength;
    struct sockaddr_in clientAddress, serverAddress;
    SocketList* SocketHead = NULL;
    SocketList* SocketTail = NULL;

    bool bSpawnedCommsThread = false;

    // Create socket for listening for client connection requests.
    listenSocket = socket(AF_INET, SOCK_STREAM, 0);
    if (listenSocket < 0) {
        printf("cannot create listen socket\n");
        exit(1);
    }

    // bind, use hton as we might need to run on x86 or XScale
    memset(&serverAddress, 0, sizeof(serverAddress)); // clear unused fields
    serverAddress.sin_family = AF_INET;
    serverAddress.sin_addr.s_addr = htonl(INADDR_ANY);
    serverAddress.sin_port = htons(PIGPort);

    if (bind(listenSocket,
             (struct sockaddr *) &serverAddress,
             sizeof(serverAddress)) < 0) {
        printf("cannot bind socket\n");
        exit(1);
    }

    // now we wait for connections from clients.
    listen(listenSocket, 5);

    while (1) {
        // Accept a connection with a client that is requesting one. The
        // accept() call is a blocking call; i.e., this thread of
        // execution stops until a connection comes in.
        // connectSocket is a new socket that the system provides,
        // separate from listenSocket. This is put into a linked list of
        // sockets to listen to, which is serviced in a separate thread.

        clientAddressLength = sizeof(clientAddress);
        connectSocket = accept(listenSocket,
                               (struct sockaddr*) &clientAddress,
                               &clientAddressLength);
        if (connectSocket < 0)
        {
            printf("cannot accept connection\n ");
            exit(1); // perhaps a bit harsh?
        }
        else
        {
            // Add this to the linked list of sockets we are waiting for.
            SocketList* pSockList;
            pSockList = new SocketList;
            // and add one to the count for monitoring.
            MonitoredProcessCount++;

            if(!bSpawnedCommsThread) // this is the first connection
            {
                pSockList->NextSocket = pSockList;
                pSockList->Socket = connectSocket;
                SocketHead = pSockList;
                SocketTail = pSockList;
                SpawnCommsThread(pSockList);
                bSpawnedCommsThread = true;
            }
            else // append to the tail and keep the list circular
            {
                pSockList->Socket = connectSocket;
                SocketTail->NextSocket = pSockList;
                pSockList->NextSocket = SocketHead;
                SocketTail = pSockList;
            }
        }
    }
}

*********************** end source file*********************************

This gets built into a shared library so that all processes which need
monitoring can use it without having to copy/paste the code. I think
this is a good idea?

To use it I have a couple of dinky programs:

*******************Monitored process source*************************

#include <cstdlib>
#include <iostream>
#include "PIG_Socket.h"

int main(int argc, char *argv[])
{
    PMS_PIG_Socket_Report* Piglet;

    Piglet = new PMS_PIG_Socket_Report(1); // 1 second pauses
    Piglet->Run(); // this never returns.

    return EXIT_SUCCESS;
}
******************* end monitored process****************************

******************monitoring process*********************************

#include <cstdlib>
#include <iostream>
#include "PIG_Socket.h"

int main(int argc, char *argv[])
{
    PMS_PIG_Socket_Receive* thePIGCatcher;
    thePIGCatcher = new PMS_PIG_Socket_Receive();
    thePIGCatcher->Listen(); // this does not return.
    return EXIT_SUCCESS;
}

******************end monitoring process*********************************


Ultimately each monitored process will run the socket in its lowest
priority thread, and the monitor will monitor them. The idea is that if
some process is too busy to run its lowest priority thread then it will
not report ok, and the watchdog will recover the system.

Dave
