Code Comments
Hi All,

I have several monitored processes that report 'ok' down a socket to a guardian. If any process fails to report, we stop kicking the hardware watchdog and warm reset.

My design is as follows:

One 'comms' thread for each process, which listens to its socket, blocked on read. When read returns, it checks that the correct thing was sent and then gives a binary semaphore. It then loops back to wait on read.

A 'watchdog kicking' thread, which blocks on all the semaphores that are given by the comms threads. It takes the semaphores as they are given, but never gives them. When all semaphores have been given, it kicks the hardware watchdog, then loops back to blocking on the semaphores.

I think this will achieve my aims, but am I going to run into problems with it? Is there a better way (in POSIX) to synchronise threads?

Thanks,
Dave
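For readers following the thread, the design described above might be sketched roughly as below, assuming POSIX unnamed semaphores from <semaphore.h> are available on the target. The HealthFlags type and its member names are hypothetical and do not come from the original post; the extra sem_trywait drain is one possible way to keep each semaphore effectively binary, not something the poster specified.

```cpp
#include <semaphore.h>
#include <time.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <vector>

// Hypothetical helper: one semaphore per monitored process.
class HealthFlags {
public:
    explicit HealthFlags(std::size_t n) : sems_(n) {
        for (std::size_t i = 0; i < sems_.size(); ++i)
            sem_init(&sems_[i], 0, 0);           // not shared between processes, starts "taken"
    }
    ~HealthFlags() {
        for (std::size_t i = 0; i < sems_.size(); ++i)
            sem_destroy(&sems_[i]);
    }

    // Called by comms thread i after it has read a valid 'ok' message.
    void report_ok(std::size_t i) {
        assert(i < sems_.size());
        sem_post(&sems_[i]);
    }

    // Called by the watchdog thread. Returns true only if every process
    // has reported before the shared deadline expires.
    bool all_reported(int timeout_sec) {
        timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec += timeout_sec;

        bool ok = true;
        for (std::size_t i = 0; i < sems_.size(); ++i) {
            int rc;
            do {
                rc = sem_timedwait(&sems_[i], &deadline);
            } while (rc == -1 && errno == EINTR);   // retry if interrupted by a signal
            if (rc != 0) {
                ok = false;                          // this process missed the deadline
                continue;
            }
            // Drain extra posts so the semaphore behaves like a binary one.
            while (sem_trywait(&sems_[i]) == 0) {}
        }
        return ok;
    }

private:
    HealthFlags(const HealthFlags&);                 // semaphores must not be copied
    HealthFlags& operator=(const HealthFlags&);
    std::vector<sem_t> sems_;
};
```

The watchdog thread would call all_reported() in a loop and kick the hardware only when it returns true. Because the index of the late semaphore is known, the same loop could also log which process failed to report.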
david.sander...@bem.fki-et.com wrote:
> Hi All,
> I have several monitored processes, who report 'ok' down a socket to a
> guardian.
> If any process fails to report then we stop kicking the hardware
> watchdog and warm reset.
>
> my design is as follows:
>
> 1 'comms' thread for each process, which listens to socket, blocked on
> read.
> When read returns checks that the correct thing was sent, and then
> gives a binary semaphore.
> loops back to wait on read.
>
> a 'watchdog kicking' thread, which blocks on all semaphores that are
> given by the comms threads.
> it takes the semaphores as they become given, but never gives them.
> When all semaphores have been given kicks the hardware watchdog.
> loops back to blocking on semaphores.
>
> I think this will achieve my aims, but am I going to run into problems
> with it?
> Is there a better way (in POSIX) to synchronise threads?
>
> thanks
>
> Dave

Using a thread that does almost nothing for each monitored process seems like overkill to me. You also do not see or log which daemon causes the trouble, which makes debugging difficult.

Also, PThreads (POSIX threads) does not implement semaphores. It uses mutexes, joins and condition variables for synchronisation. In your case, a condition variable (which must be protected by a mutex) should be used.

Hubble.
david.sanderson@bem.fki-et.com wrote:
> Hi All,
> I have several monitored processes, who report 'ok' down a socket to a
> guardian.
> If any process fails to report then we stop kicking the hardware
> watchdog and warm reset.

I could not understand the "then" part.

> my design is as follows:
>
> 1 'comms' thread for each process, which listens to socket, blocked on
> read.
> When read returns checks that the correct thing was sent, and then
> gives a binary semaphore. loops back to wait on read.

Did you mean it increments the semaphore?

> a 'watchdog kicking' thread, which blocks on all semaphores that are
> given by the comms threads.
> it takes the semaphores as they become given, but never gives them.
> When all semaphores have been given kicks the hardware watchdog.
> loops back to blocking on semaphores.

Are you saying that the thread does something with the watchdog only when it has acquired all the semaphores?

> I think this will achieve my aims, but am I going to run into problems
> with it?
> Is there a better way (in POSIX) to synchronise threads?

I don't think you need more than one thread here; a simple select() loop would do.
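As an illustration of the single-threaded select() suggestion, here is a minimal sketch, assuming the guardian already holds one connected descriptor per monitored process. The function name all_reported_within and the 64-byte buffer are hypothetical choices, and the sketch treats any readable data as an 'ok' report rather than validating the message.

```cpp
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <chrono>
#include <cstddef>
#include <vector>

// Returns true if every descriptor in fds delivers some data before the
// timeout expires; false on timeout, error, or a closed connection.
bool all_reported_within(const std::vector<int>& fds, int timeout_sec) {
    typedef std::chrono::steady_clock Clock;
    const Clock::time_point deadline = Clock::now() + std::chrono::seconds(timeout_sec);

    std::vector<bool> seen(fds.size(), false);
    std::size_t remaining = fds.size();

    while (remaining > 0) {
        // Recompute the remaining time each pass; select() may or may not
        // update the timeval itself, depending on the platform.
        long long left_us = std::chrono::duration_cast<std::chrono::microseconds>(
                                deadline - Clock::now()).count();
        if (left_us <= 0) return false;
        timeval tv;
        tv.tv_sec = left_us / 1000000;
        tv.tv_usec = left_us % 1000000;

        // Watch only the descriptors that have not reported yet.
        fd_set readable;
        FD_ZERO(&readable);
        int maxfd = -1;
        for (std::size_t i = 0; i < fds.size(); ++i) {
            if (seen[i]) continue;
            assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
            FD_SET(fds[i], &readable);
            if (fds[i] > maxfd) maxfd = fds[i];
        }

        int n = select(maxfd + 1, &readable, NULL, NULL, &tv);
        if (n < 0 && errno == EINTR) continue;   // interrupted by a signal, retry
        if (n <= 0) return false;                // timeout or error

        for (std::size_t i = 0; i < fds.size(); ++i) {
            if (seen[i] || !FD_ISSET(fds[i], &readable)) continue;
            char buf[64];
            if (read(fds[i], buf, sizeof buf) <= 0) return false;  // peer closed or error
            seen[i] = true;
            --remaining;
        }
    }
    return true;
}
```

The guardian's main loop would call this once per watchdog period and kick the hardware only on a true result. No mutexes or semaphores are needed because everything runs in one thread.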
Hubble wrote:
> david.sander...@bem.fki-et.com wrote:
>
> using a thread doing almost nothing for each monitored process seems to
> me a bit like an overkill.

On reflection it does seem a bit OTT, but I thought the correct way to handle multiple accepts on a socket was to pass the listen off to a separate thread? Currently there will be 6 monitored processes, but this might grow; we are still in the early stages of this project.

> Also you do not see or log which daemon
> causes the trouble, making debugging difficult.

When the thread gives the semaphore it can write to a log saying that process xxx is ok. Of course, with a single-threaded read from multiple sockets, the same can be done as each one reports ok.

> Also, PThreads (POSIX threads) does not implement semaphores. It uses
> Mutexes, joins and condition variables for synchronisation. In your
> case, a condition variable (which must be surrounded by a mutex) should
> be used.

Are you saying I can't use a semaphore in a thread, or that I shouldn't?

Dave
david.sanderson@bem.fki-et.com wrote:
> Hubble wrote:
>
> On reflection it does seem a bit ott, but i thought the correct way to
> handle multiple accepts on a socket was to pass the listen off to a
> separate thread?
> currently there will be 6 monitored processes, but this might grow, we
> are still in the early stages of this project.

A select loop would suffice.

[]

> Are you saying I cant use a semaphore in a thread, or that I shouldnt?

You can use semaphores; they are a part of POSIX. If you wait until several binary semaphores have all been released, you may simplify things by using a counter + mutex + condition variable instead of several semaphores.

Note that if you use one thread with select(), you don't need any semaphores or mutexes.
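The counter + mutex + condition variable idea could look something like the sketch below. The ReportCounter class and its method names are hypothetical; the per-process flag vector is an added assumption so that a process reporting twice in one period is only counted once.

```cpp
#include <pthread.h>
#include <time.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how many distinct processes have reported.
class ReportCounter {
public:
    explicit ReportCounter(std::size_t n) : reported_(n, false), count_(0) {
        pthread_mutex_init(&mutex_, NULL);
        pthread_cond_init(&cond_, NULL);
    }
    ~ReportCounter() {
        pthread_cond_destroy(&cond_);
        pthread_mutex_destroy(&mutex_);
    }

    // Called by comms thread i when its process has reported 'ok'.
    void report_ok(std::size_t i) {
        assert(i < reported_.size());
        pthread_mutex_lock(&mutex_);
        if (!reported_[i]) {              // count each process once per period
            reported_[i] = true;
            ++count_;
        }
        if (count_ == reported_.size())
            pthread_cond_signal(&cond_);  // wake the watchdog thread
        pthread_mutex_unlock(&mutex_);
    }

    // Called by the watchdog thread. Returns true if all processes reported
    // before the timeout, and resets the counter for the next period.
    bool wait_all(int timeout_sec) {
        timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec += timeout_sec;

        pthread_mutex_lock(&mutex_);
        int rc = 0;
        // Loop guards against spurious wakeups; rc becomes ETIMEDOUT on timeout.
        while (count_ < reported_.size() && rc == 0)
            rc = pthread_cond_timedwait(&cond_, &mutex_, &deadline);
        const bool ok = (count_ == reported_.size());
        if (ok) {
            reported_.assign(reported_.size(), false);
            count_ = 0;
        }
        pthread_mutex_unlock(&mutex_);
        return ok;
    }

private:
    ReportCounter(const ReportCounter&);             // not copyable
    ReportCounter& operator=(const ReportCounter&);
    pthread_mutex_t mutex_;
    pthread_cond_t cond_;
    std::vector<bool> reported_;
    std::size_t count_;
};
```

Compared with one semaphore per process, the watchdog thread waits on a single condition, and the flags left unset after a timeout show which process was late.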
david.sanderson@bem.fki-et.com wrote:
> Are you saying I cant use a semaphore in a thread, or that I shouldnt?

My old PThreads documentation did not mention semaphores. I think they weren't even present in that version. So I confused them with SysV semaphores, which I would never use.

Hubble.
<Snipped bit describing how I want to monitor several processes and kick a hw watchdog only when they all report in as ok>

Not sure if I mentioned this before, but this is an embedded XScale CPU running QNX, and I would like to be as POSIX portable as possible.

Having read the help docs for my system on select I was a little (well ok, quite a lot) confused, so I wrote something in a way I understand. Critique welcome, I'm still learning...

********************* Header file ************************************

#ifndef PIG_SOCKET_H
#define PIG_SOCKET_H

const unsigned short int PIGPort = 1600;
const int SQUEAL_MSG_SIZE = 12; // "Process OK\n" plus the terminating NUL

class PMS_PIG_Socket_Report
{
public:
    PMS_PIG_Socket_Report(int PauseBetweenHealthReports);
    ~PMS_PIG_Socket_Report(void);
    void Run(void);
private:
    void Connect(void);
    void Squeal(void);
    int PauseBetweenSqueals;
    int socketDescriptor; //handle for sending down
};

struct SocketList
{
    SocketList* NextSocket;
    int Socket;
};

class PMS_PIG_Socket_Receive
{
public:
    PMS_PIG_Socket_Receive(void);
    ~PMS_PIG_Socket_Receive(void);
    void Listen(void);
private:
    void SpawnCommsThread(SocketList* pSockListHead);
    int listenSocket, connectSocket;
};

#endif //PIG_SOCKET_H

******************** Source File *************************************

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <arpa/inet.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <iostream>
#include <string.h>
#include <pthread.h>
#include "PIG_Socket.h"

PMS_PIG_Socket_Report::PMS_PIG_Socket_Report(int PauseBetweenHealthReports):
    PauseBetweenSqueals(PauseBetweenHealthReports)
{
    Connect(); //assumes we are wanting to connect as we are being constructed and this class doesn't do anything else...
}

PMS_PIG_Socket_Report::~PMS_PIG_Socket_Report(void)
{
    //hope never to get here, so we won't do anything if we do...
}

void PMS_PIG_Socket_Report::Run(void)
{
    while(1)
    {
        sleep(PauseBetweenSqueals); // usleep might be better, depends on the watchdog timeout?
        Squeal();
    }
}

void PMS_PIG_Socket_Report::Connect(void)
{
    struct sockaddr_in serverAddress;
    struct hostent *hostInfo;

    hostInfo = gethostbyname("localhost"); // we are currently only going to monitor processes on THIS processor.
    if (hostInfo == NULL)
    {
        printf("problem interpreting host: localhost\n");
        exit(1); // perhaps a bit harsh, might need fixing later
    }

    // Create a socket, we want reliable (TCP) transport
    socketDescriptor = socket(AF_INET, SOCK_STREAM, 0);
    if (socketDescriptor < 0)
    {
        printf("cannot create socket\n");
        exit(1); //harsh...
    }

    // Connect to the PIG monitoring.
    memset(&serverAddress, 0, sizeof(serverAddress));
    serverAddress.sin_family = hostInfo->h_addrtype;
    memcpy((char *) &serverAddress.sin_addr.s_addr, hostInfo->h_addr_list[0], hostInfo->h_length);
    serverAddress.sin_port = htons(PIGPort);
    if (connect(socketDescriptor, (struct sockaddr*) &serverAddress, sizeof(serverAddress)) < 0)
    {
        printf("cannot connect\n");
        exit(1); //harsh
    }
}

void PMS_PIG_Socket_Report::Squeal(void)
{
    // send "Process OK" down the socket (the +1 also sends the terminating NUL).
    if (send(socketDescriptor, "Process OK\n", strlen("Process OK\n") + 1, 0) < 0)
    {
        printf("cannot send data ");
        close(socketDescriptor);
        exit(1);
    }
}

/********************************************************************/

// keep a count so we know if we've been through them all.
// NOTE: written by Listen() and read by the comms thread without any locking.
static int MonitoredProcessCount;

//the thread that listens for the squeals and then kicks the wdog
void* ThreadSockRcvFunc(void* arg)
{
    char line[SQUEAL_MSG_SIZE];
    SocketList* pCurrentSocket = (SocketList*)arg;
    int Count = 0;

    while(1) // run forever
    {
        while(Count < MonitoredProcessCount)
        {
            if(recv(pCurrentSocket->Socket, line, SQUEAL_MSG_SIZE, 0) > 0)
            {
                printf("RX'd from process %d\n", Count);
                pCurrentSocket = pCurrentSocket->NextSocket; // as the list is circular this will be the beginning when we get to the end...
            }
            else
            {
                printf("recv returned zero or less\n"); // don't expect to get here, but diagnostic if we do.
            }
            Count++; //next one
        }
        printf("Processed all requests, Kick the wdog and return to the top of the list\n");
        //TODO actually KICK the WATCHDOG
        Count = 0;
        sleep(1); // probably not required? or maybe too long
    }
}

PMS_PIG_Socket_Receive::PMS_PIG_Socket_Receive(void)
{
    // don't think we have anything to do here?
    // we could call listen, but it does not return,
    // which is not a nice thing to do...
}

PMS_PIG_Socket_Receive::~PMS_PIG_Socket_Receive(void)
{
    //never expect to be destructed in the normal run of things.
}

//this could be in the Listen code?
void PMS_PIG_Socket_Receive::SpawnCommsThread(SocketList* pSockListHead)
{
    pthread_t thread; // passing NULL for the thread ID is not portable POSIX
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    pthread_create(&thread, &attr, &ThreadSockRcvFunc, pSockListHead);
    pthread_attr_destroy(&attr);
}

void PMS_PIG_Socket_Receive::Listen(void)
{
    // accept the incoming connection and palm it off to a new port number with a thread servicing it.
    socklen_t clientAddressLength;
    struct sockaddr_in clientAddress, serverAddress;
    SocketList* SocketHead = NULL;
    SocketList* SocketTail = NULL;
    bool bSpawnedCommsThread = false;

    // Create socket for listening for client connection requests.
    listenSocket = socket(AF_INET, SOCK_STREAM, 0);
    if (listenSocket < 0)
    {
        printf("cannot create listen socket\n");
        exit(1);
    }

    //bind, use hton as we might need to run on x86 or XScale
    memset(&serverAddress, 0, sizeof(serverAddress));
    serverAddress.sin_family = AF_INET;
    serverAddress.sin_addr.s_addr = htonl(INADDR_ANY);
    serverAddress.sin_port = htons(PIGPort);
    if (bind(listenSocket, (struct sockaddr *) &serverAddress, sizeof(serverAddress)) < 0)
    {
        printf("cannot bind socket\n");
        exit(1);
    }

    //now we wait for connections from clients.
    listen(listenSocket, 5);
    while (1)
    {
        // Accept a connection with a client that is requesting one. The
        // accept() call is a blocking call; i.e., this thread of
        // execution stops until a connection comes in.
        // connectSocket is a new socket that the system provides,
        // separate from listenSocket. This is put into a linked list of
        // sockets to listen to, run in a separate thread.
        clientAddressLength = sizeof(clientAddress);
        connectSocket = accept(listenSocket, (struct sockaddr*) &clientAddress, &clientAddressLength);
        if (connectSocket < 0)
        {
            printf("cannot accept connection\n");
            exit(1); //perhaps a bit harsh?
        }
        else
        {
            //Add this to the linked list of sockets we are waiting for.
            SocketList* pSockList;
            pSockList = new SocketList;
            //and add one to the count for monitoring.
            MonitoredProcessCount++;
            if(!bSpawnedCommsThread) // this is the first connection
            {
                pSockList->NextSocket = pSockList;
                pSockList->Socket = connectSocket;
                SocketHead = pSockList;
                SocketTail = pSockList;
                SpawnCommsThread(pSockList);
                bSpawnedCommsThread = true;
            }
            else //bugger about with the list pointers...
            {
                pSockList->Socket = connectSocket;
                SocketTail->NextSocket = pSockList;
                pSockList->NextSocket = SocketHead;
                SocketTail = pSockList;
            }
        }
    }
}

*********************** end source file *********************************

This gets built into a shared library so that all processes which need monitoring can use it without having to copy/paste the code. I think this is a good idea?

To use it I have a couple of dinky programs:

******************* Monitored process source *************************

#include <cstdlib>
#include <iostream>
#include "PIG_Socket.h"

int main(int argc, char *argv[])
{
    PMS_PIG_Socket_Report* Piglet;
    Piglet = new PMS_PIG_Socket_Report(1); //1 second pauses
    Piglet->Run(); // this never returns.
    return EXIT_SUCCESS;
}

******************* end monitored process ****************************

****************** monitoring process *********************************

#include <cstdlib>
#include <iostream>
#include "PIG_Socket.h"

int main(int argc, char *argv[])
{
    PMS_PIG_Socket_Receive* thePIGCatcher;
    thePIGCatcher = new PMS_PIG_Socket_Receive();
    thePIGCatcher->Listen(); // this does not return.
    return EXIT_SUCCESS;
}

****************** end monitoring process *********************************

Ultimately each monitored process will run the socket in its lowest priority thread, and the monitor will monitor them. The idea is that if some process is too busy to run its lowest priority thread then it will not report ok, and the watchdog will recover the system.

Dave