See http://doc.desktopgrid.hu/ for up-to-date information and documentation.
Introduction
The DC-API was created by MTA SZTAKI to allow easy implementation and deployment of distributed applications on multiple grid environments.
In order to accommodate the needs of very different grid environments, the DC-API supports only a restricted master-worker programming model. The restrictions include:
Master-worker concept: there is a designated master process running somewhere on the grid infrastructure. The master process can submit worker processes called work units.
Every work unit is a sequential application.
There is support for limited messaging between the master and the running work units. It can be used to send status and control messages, but it is not suitable for parallel programming.
There cannot be any direct communication between work units.
General concepts
Programming model
DC-API applications consist of two major components: a master application and one or more client applications. The master is responsible for dividing the global input data into smaller chunks and distributing these chunks in the form of work units. The master is also responsible for interpreting the outputs generated by the work units and combining them into a global output.
The master application usually runs as a daemon, but it is also possible to write a master that runs periodically (e.g. from cron), processes the outstanding events, and exits.
Client applications are simple sequential programs that take their input from the master, perform some computation on it and produce some output.
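To make the master's dividing role concrete, here is a minimal sketch in C of splitting a global input of a known number of items into near-equal chunks, one per work unit. The names struct chunk and chunk_range are hypothetical and are not part of the DC-API; how a chunk is actually written to a work unit's input file depends on the application.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type, not part of the DC-API: the half-open range
 * [begin, end) of input items assigned to one work unit. */
struct chunk {
    size_t begin;
    size_t end;
};

/* Split `total` items into `parts` chunks whose sizes differ by at most one.
 * The first (total % parts) chunks receive one extra item. */
static struct chunk chunk_range(size_t total, size_t parts, size_t index)
{
    assert(parts > 0 && index < parts);

    size_t base = total / parts;   /* minimum size of every chunk */
    size_t extra = total % parts;  /* number of chunks that get one more item */
    struct chunk c;

    c.begin = index * base + (index < extra ? index : extra);
    c.end = c.begin + base + (index < extra ? 1 : 0);
    return c;
}
```

With 10 items and 3 work units this yields the ranges [0, 4), [4, 7) and [7, 10), so every item belongs to exactly one work unit.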
Writing a master application
A typical master application does the following steps:
Initializes the DC-API library by calling the DC_initMaster() function.
    int DC_initMaster (const char *configFile);
    Initializes the master side of the DC-API. This function must be called before any other DC-API function.
    configFile: name of the configuration file to use. NULL means use DC_CONFIG_FILE.
    Returns: 0 if successful, or a DC_ErrorCode.
Calls the DC_setResultCB() function and optionally some of the DC_setSubresultCb(), DC_setMessageCb(), DC_setSuspendCb() and DC_setValidateCb() functions, depending on the features (messaging, subresults etc.) it wants to use.
In its main loop, the master calls the DC_createWU() function to create new work units when needed. If the total number of work units is small (depending on the grid infrastructure), then the master may also create all the work units in advance. If the total number of work units is too large for this, the master may use the DC_getWUNumber() function to determine the number of running work units, and create new work units only if this number falls below a certain threshold.
    DC_Workunit* DC_createWU (const char *clientName, const char *arguments[], int subresults, const char *tag);
    Creates a new work unit. The work unit is not known to the underlying grid infrastructure until DC_submitWU() is called.
Also in its main loop, the master calls the DC_processMasterEvents() function, which checks for outstanding events and invokes the appropriate callbacks.
    int DC_processMasterEvents (int timeout);
    Processes work unit events. If a work unit completes and its result is available, or if a message or a subresult has arrived, the appropriate callback functions are called. The received event is automatically destroyed when the callback function returns (including the deletion of files if the event was a result or subresult).
Alternatively, the master may use the DC_waitMasterEvent() and DC_waitWUEvent() functions instead of DC_processMasterEvents() if it prefers to receive event structures instead of using callbacks.
    DC_MasterEvent* DC_waitMasterEvent (const char *wuFilter, int timeout);
    Checks for events and returns them directly. This function does not invoke any event-processing callbacks. Contrary to DC_processMasterEvents(), the application is responsible for destroying the returned event when it is no longer needed.
    DC_MasterEvent* DC_waitWUEvent (DC_Workunit *wu, int timeout);
    Checks for events for a particular work unit. This function does not invoke any event-processing callbacks.
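The threshold-based creation of work units described in the steps above can be sketched as a small decision function. This is only an illustration under stated assumptions: wus_to_create and its parameters are hypothetical names, not part of the DC-API, and the running count is assumed to come from DC_getWUNumber(), whose exact signature is not shown in this document.

```c
#include <assert.h>

/* Hypothetical helper, not part of the DC-API.
 * running:   work units currently running (e.g. as reported by DC_getWUNumber())
 * threshold: number of work units the master wants to keep in flight (assumed tunable)
 * remaining: work units that have not been created yet
 * Returns how many new work units to create in this loop iteration. */
static int wus_to_create(int running, int threshold, int remaining)
{
    assert(running >= 0 && threshold >= 0 && remaining >= 0);

    if (running >= threshold)
        return 0;                       /* enough work is already in flight */

    int deficit = threshold - running;  /* free slots below the threshold */
    return deficit < remaining ? deficit : remaining;
}
```

In each iteration of its main loop, a master could create and submit that many work units with DC_createWU() and DC_submitWU(), then call DC_processMasterEvents() with a timeout so that completed results are handled before the next refill.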
Writing a client application
A typical client application performs the following steps:
Initializes the DC-API library by calling the DC_initClient() function.
    int DC_initClient (void);
    Initializes the client API. This function must be called before any other DC-API function.
Identifies the location of its input/output files by calling the DC_resolveFileName() function.
    char* DC_resolveFileName (DC_FileType type, const char *logicalFileName);
    Resolves the local name of input/output files. The real name (and path) of an input/output file may be different from what the client expects. This function performs the translation from the logical names used by the client to the real names used by the infrastructure.
Note
The client application must not assume that it can read, create, or write any files other than those whose names are returned by DC_resolveFileName().
During the computation, the client should periodically call the DC_checkClientEvent() function and process the received events.
    DC_ClientEvent* DC_checkClientEvent (void);
    Checks for client control events. The returned event should be destroyed using DC_destroyClientEvent() when it is no longer needed. If the returned event is DC_CLIENT_CHECKPOINT but the client does not support checkpointing, it should still call the DC_checkpointMade() function with a NULL argument to inform the grid infrastructure that no checkpoint will be delivered.
If possible, the client should call the DC_fractionDone() function with the fraction of the work completed. On some grid infrastructures (e.g. BOINC) this will allow the client's supervisor process to show the progress of the application to the user.
    void DC_fractionDone (double fraction);
    Informs the controlling environment about the fraction of the work already done. Ideally this should be the CPU time used so far divided by the total CPU time that will be needed for the computation.
Ideally the value passed to the DC_fractionDone() function should be proportional to the time elapsed so far compared to the total time that will be needed to complete the computation.
The client should call the DC_finishClient() function at the end of the computation. As a result, all output files will be sent to the master and the master will be notified about the completion of the work unit.
    void DC_finishClient (int exitcode);
    Finishes computation. Tells the DC-API to finish this work unit and start a new one. All output files are transferred to the master and the master is notified about the completion of the work unit.
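For the progress-reporting step above, a minimal sketch of computing the value to pass to DC_fractionDone() might look as follows. It assumes the computation consists of a known number of steps of roughly equal cost, so that the step count is proportional to the CPU time used; fraction_done is a hypothetical helper and not part of the DC-API.

```c
#include <assert.h>

/* Hypothetical helper, not part of the DC-API. Returns the completed fraction
 * of the work as a value in [0.0, 1.0], assuming every step costs about the
 * same amount of CPU time. */
static double fraction_done(long steps_done, long steps_total)
{
    assert(steps_total > 0);

    if (steps_done <= 0)
        return 0.0;                     /* nothing finished yet */
    if (steps_done >= steps_total)
        return 1.0;                     /* never report more than 100% */
    return (double)steps_done / (double)steps_total;
}
```

A client could then call DC_fractionDone(fraction_done(i + 1, n)) after finishing step i of n. If the steps differ widely in cost, weighting them by their expected CPU time would follow the documented recommendation more closely.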
Messaging
The DC-API provides limited messaging functionality between the master application and the clients. DC-API messaging has the following features and restrictions:
Messages are not reliable in the sense that if the client is not actually running when a message is being sent to it (e.g. because it is queued by the backend grid infrastructure), then the message may be silently dropped.
The ordering of messages is not necessarily maintained.
Messages are delivered asynchronously. There is no upper bound on the time that may elapse before a message is actually delivered.
Due to the above restrictions, DC-API messages are not suitable for message-based parallel processing. They are meant for sending short status messages about long-running operations, or for sending control messages like a command to cancel a given computation.
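Because messages may be dropped or arrive out of order, a receiver may want to treat control messages defensively. The sketch below shows one possible approach, assuming the application itself prefixes every message with an increasing sequence number starting at 1. The message layout "<sequence> <command>", struct control_state and accept_message are all hypothetical and not part of the DC-API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical application-level state, not part of the DC-API. */
struct control_state {
    unsigned long last_seq;  /* highest sequence number accepted so far */
    char command[32];        /* most recently accepted command */
};

/* Assumed message layout: "<sequence> <command>", e.g. "7 cancel".
 * Returns 1 and updates `st` if `msg` is well-formed and newer than anything
 * accepted before; returns 0 for malformed, duplicate or stale messages. */
static int accept_message(struct control_state *st, const char *msg)
{
    unsigned long seq;
    char cmd[32];

    assert(st != NULL && msg != NULL);

    if (sscanf(msg, "%lu %31s", &seq, cmd) != 2)
        return 0;            /* malformed: ignore */
    if (seq <= st->last_seq)
        return 0;            /* stale or duplicate: ignore */

    st->last_seq = seq;
    strcpy(st->command, cmd); /* safe: cmd holds at most 31 characters */
    return 1;
}
```

With this scheme a late "1 resume" arriving after "2 pause" is ignored rather than undoing the newer command. Lost messages are still lost, so commands should be written so that the sender can simply repeat them.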



