Department of Computing Science: Grid: Architectures, Software and Applications

Assignment 2 - A fault tolerant high-throughput PSE

Due: 2007-02-08 (design) 2007-03-09 (report)

Introduction

The purpose of this assignment is to give you practical experience with Globus programming. The assignment covers a broad range of GT4 services and protocols, including information services, job submission, data transfer, and security. An additional purpose is for you to gain a deeper understanding of resource discovery, scheduling, and secure communication. The assignment will also highlight the benefits and drawbacks of programming for the Grid. The assignment shall be solved in groups of two students.

Specification

Your task is to implement a system for solving embarrassingly parallel problems on the Grid using a master-worker approach. The overall goal of the system is to solve the problem as quickly as possible. The problem can be selected freely, as long as it fulfills the following requirements:

Must be embarrassingly parallel, i.e., the problem must be decomposable into subproblems that can be independently solved. Preferably, the size of the subproblems should be variable, enabling adjustment of the execution time required to solve a subproblem.
The overall solution is obtained by combining the solutions to the subproblems.
The problem must be computationally intense, and should preferably generate large amounts of output data, and/or require large amounts of input data.

Two suggested problems:

You are of course allowed (and even encouraged) to solve a problem of your own choice. Inform the assistants of your choice of problem.

Architectural overview

The system consists of a client and a problem solving environment. Conceptually, the problem solving environment can be decomposed into three modules: a problem solver, a node program, and a resource broker.

The problem solver partitions the problem into subproblems, requests appropriate Grid resources from the resource broker, and submits jobs solving the subproblems to the selected Grid resources. When working on a problem, the problem solver updates the client of the progress using notifications. Once all subproblems are solved, the problem solver generates the solution to the overall problem from the solutions of the subproblems.

The node program solves one subproblem on a Grid resource selected by the broker, i.e., the node program will be submitted in a job to the Grid resource. Note that the focus of this assignment should be on the resource broker and the problem solver. Do not spend much time implementing a node program.

The resource broker finds available Grid resources, classifies which of those are appropriate for running the node program and selects a resource for the subproblem. When selecting resources, the broker must consider both the current load of the resource and the performance of jobs previously executed on that resource. The latter includes both the execution times of the jobs and the ratio of the jobs that have failed. Carefully describe and motivate your resource selection algorithm in the report. One must also consider how often the resource broker should retrieve new MDS information. Too frequent updates result in large overhead, while too infrequent updates may result in poor resource selection. The broker should furthermore be able to stage files to and from the Grid resources.
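The three signals named above (current load, execution times of earlier jobs, and failure ratio) can be folded into a single score per resource. The following is a minimal sketch under stated assumptions, not a recommended algorithm: the names `ResourceInfo`, `BrokerSketch`, `overloaded`, `score` and `select` are hypothetical, the fields are illustrative rather than an MDS schema, and the way the factors are multiplied together is an arbitrary choice made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical snapshot of one Grid resource; the fields are illustrative, not an MDS schema. */
class ResourceInfo {
    final String name;
    final int cpus;
    final int queuedJobs;       // current load, as reported by the information service
    final double meanExecSecs;  // mean execution time of earlier jobs on this resource
    final int finishedJobs;     // earlier jobs that ended, successfully or not
    final int failedJobs;       // earlier jobs that ended in failure

    ResourceInfo(String name, int cpus, int queuedJobs,
                 double meanExecSecs, int finishedJobs, int failedJobs) {
        this.name = name;
        this.cpus = cpus;
        this.queuedJobs = queuedJobs;
        this.meanExecSecs = meanExecSecs;
        this.finishedJobs = finishedJobs;
        this.failedJobs = failedJobs;
    }

    double failureRatio() {
        return finishedJobs == 0 ? 0.0 : (double) failedJobs / finishedJobs;
    }
}

class BrokerSketch {
    private final double maxQueuedPerCpu; // configurable load limit, e.g. 5.0

    BrokerSketch(double maxQueuedPerCpu) {
        this.maxQueuedPerCpu = maxQueuedPerCpu;
    }

    /** A resource at or above the limit must not receive new jobs. */
    boolean overloaded(ResourceInfo r) {
        return r.queuedJobs >= maxQueuedPerCpu * r.cpus;
    }

    /** Estimated cost of running one subproblem; lower is better. */
    double score(ResourceInfo r) {
        double waitFactor = 1.0 + (double) r.queuedJobs / r.cpus;
        // Expected number of attempts until success; the ratio is capped to avoid division by zero.
        double expectedAttempts = 1.0 / (1.0 - Math.min(r.failureRatio(), 0.9));
        return r.meanExecSecs * waitFactor * expectedAttempts;
    }

    /** Returns the cheapest resource that is not overloaded, if there is one. */
    Optional<ResourceInfo> select(List<ResourceInfo> candidates) {
        ResourceInfo best = null;
        for (ResourceInfo r : candidates) {
            if (overloaded(r)) {
                continue;
            }
            if (best == null || score(r) < score(best)) {
                best = r;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

A resource without any job history gets a failure ratio of zero here, which favours untested resources. Whether that is desirable is one of the design choices the report should motivate.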

The picture below shows one possible decomposition of the system into modules. Other decompositions are possible.

Requirements

The implemented system must meet the following requirements.

The system should be implemented in Java. Scripting languages are only allowed for very limited purposes. The adventurously minded may, after discussion with the teaching assistants, use C/C++. Support and documentation are, however, limited for these languages.
GridFTP must be used for all file transfers, i.e., you may not use the built-in filestaging mechanism in RSL.
All communication between modules must be authenticated.
There must be a configurable upper limit for the number of concurrently submitted jobs. The problem solver is not allowed to submit all subproblems at once and await their completion. The maximum number of active job submissions could, e.g., be limited by a command-line argument or through service configuration. (No hard-coded limit!)
The system must recover from failures in solving a subproblem. This can be done by resubmitting a new job for the subproblem. For more fine-grained error recovery techniques, see the non-mandatory extensions.
The system must avoid overloading the Grid. You are not allowed to submit new jobs to a machine with too high load. The maximum allowed load must be configurable upon system startup. One possible value for the maximum allowed load is to allow five queued jobs for each resource CPU.
As work on a problem progresses, notifications should be sent to the client. At a minimum, notifications should be sent when a subjob starts or completes.
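Two of the requirements above, the configurable limit on concurrently submitted jobs and recovery by resubmission, can be sketched with standard Java concurrency utilities. This is a minimal illustration under stated assumptions: `JobAttempt` and `ThrottledSubmitter` are hypothetical names, `JobAttempt.run` stands in for "submit one job and wait for its outcome", and no Globus API is used. The limit and the number of attempts are constructor arguments, so they could be taken from the command line or from service configuration.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for submitting one job and waiting for it; returns true on success. */
interface JobAttempt {
    boolean run(int subproblemId) throws Exception;
}

class ThrottledSubmitter {
    private final Semaphore slots;   // one permit per allowed concurrent job
    private final int maxAttempts;   // attempts per subproblem before giving up
    private final JobAttempt attempt;
    final AtomicInteger solved = new AtomicInteger();
    final AtomicInteger failedAttempts = new AtomicInteger();

    ThrottledSubmitter(int maxConcurrentJobs, int maxAttempts, JobAttempt attempt) {
        this.slots = new Semaphore(maxConcurrentJobs);
        this.maxAttempts = maxAttempts;
        this.attempt = attempt;
    }

    /** Blocks until every subproblem is solved or has used up its attempts. */
    void solveAll(List<Integer> subproblemIds) {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            for (int id : subproblemIds) {
                slots.acquire(); // wait here while the limit is reached
                pool.execute(() -> {
                    try {
                        runWithRetry(id);
                    } finally {
                        slots.release();
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }

    /** Resubmits a failed subproblem until it succeeds or the attempts run out. */
    private void runWithRetry(int id) {
        for (int i = 0; i < maxAttempts; i++) {
            boolean ok;
            try {
                ok = attempt.run(id);
            } catch (Exception e) {
                ok = false; // an exception counts as a failed job
            }
            if (ok) {
                solved.incrementAndGet();
                return;
            }
            failedAttempts.incrementAndGet();
        }
    }
}
```

In this sketch a retry reuses the slot of the failed attempt, so the number of active jobs never exceeds the limit. A real problem solver would probably ask the broker for a new resource before each resubmission.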

The system must gather the following data during its execution.

Total number of executed jobs (per resource).
Total number of failed jobs (per resource).
Total time for solving the problem.
Total execution time of all jobs.

A summary of this data should be returned to the client when a problem is completed, or suspended (see below).
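The four figures listed above fit in a small thread-safe collector. The sketch below is one possible shape, with hypothetical names (`RunStatistics`, `jobFinished`, `summary`). It assumes that a failed job also counts as an executed job, and it takes timestamps as arguments so that the caller decides which clock to use.

```java
import java.util.Map;
import java.util.TreeMap;

/** Collects per-resource job counts and the two time totals. All names are illustrative. */
class RunStatistics {
    private static final class Counters {
        int executed;
        int failed;
    }

    private final Map<String, Counters> perResource = new TreeMap<>();
    private final long startMillis; // when work on the problem started
    private long totalJobMillis;    // sum of the execution times of all jobs

    RunStatistics(long startMillis) {
        this.startMillis = startMillis;
    }

    /** Records one finished job; assumption: a failed job also counts as executed. */
    synchronized void jobFinished(String resource, long execMillis, boolean failed) {
        Counters c = perResource.computeIfAbsent(resource, k -> new Counters());
        c.executed++;
        if (failed) {
            c.failed++;
        }
        totalJobMillis += execMillis;
    }

    synchronized int executed(String resource) {
        Counters c = perResource.get(resource);
        return c == null ? 0 : c.executed;
    }

    synchronized int failed(String resource) {
        Counters c = perResource.get(resource);
        return c == null ? 0 : c.failed;
    }

    synchronized long totalJobMillis() {
        return totalJobMillis;
    }

    /** Text summary that could be returned to the client on completion or suspension. */
    synchronized String summary(long nowMillis) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Counters> e : perResource.entrySet()) {
            sb.append(e.getKey())
              .append(": executed=").append(e.getValue().executed)
              .append(", failed=").append(e.getValue().failed)
              .append('\n');
        }
        sb.append("total time (ms): ").append(nowMillis - startMillis).append('\n');
        sb.append("total job execution time (ms): ").append(totalJobMillis).append('\n');
        return sb.toString();
    }
}
```

The same per-resource counters can feed the resource broker, since the failure ratio and the mean execution time per resource follow directly from them.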
In addition to the basic functionality, your system must support the following.

Make the system capable of recovering from shutdowns/crashes. Maintain information about active and solved subproblems in a file (or even better, in a database), which is used on program restart to recover the pre-shutdown state, check for completion and/or re-establish contact with jobs that were active at the time of the crash/shutdown.
The system should be able to work on several problem instances simultaneously. When a new problem instance is submitted to the system, the system starts solving it. Furthermore, make it possible to suspend and resume work on a particular problem. Also implement a way to kill a problem instance that is no longer of interest. (Hint: model problem instances using WS-Resources).
The system should present work progress using a simple form of graphical user interface. A simplistic monitoring framework is supplied (here), but you are free to provide your own implementation of a GUI. You may of course also extend the existing code in any way to add additional functionality.
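For the crash recovery item above, the file-based variant can be as simple as a properties file that is rewritten on every state change. The sketch below is an illustration only: `StateStore`, the three status strings and the `jobHandle` parameter are hypothetical, and a job handle is assumed to be any string that is sufficient to re-establish contact with a submitted job. The file is written to a temporary sibling and then moved into place, so a crash during a write should not leave a half-written state file. Note that the `Files.move` documentation leaves the behaviour of `ATOMIC_MOVE` implementation specific when the target already exists; the sketch assumes a platform such as Linux, where the move replaces the old file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

/** Persists the status of each subproblem so that it survives a restart. Names are illustrative. */
class StateStore {
    private final Path file;
    private final Properties state = new Properties();

    /** Loads the previous state if the file exists, otherwise starts empty. */
    StateStore(Path file) {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                state.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    synchronized void markActive(int id, String jobHandle) {
        put(id, "ACTIVE " + jobHandle);
    }

    synchronized void markSolved(int id) {
        put(id, "SOLVED");
    }

    /** Subproblems that were never recorded are reported as PENDING. */
    synchronized String status(int id) {
        return state.getProperty(String.valueOf(id), "PENDING");
    }

    /** Updates the in-memory state and rewrites the file through a temporary sibling. */
    private void put(int id, String value) {
        state.setProperty(String.valueOf(id), value);
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(tmp)) {
                state.store(out, "subproblem state");
            }
            Files.move(tmp, file, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On restart, the problem solver would create a `StateStore` for the same path, skip every subproblem whose status is `SOLVED`, and use the stored handle of each `ACTIVE` subproblem to check on the job. Rewriting the whole file on each change is simple but scales poorly, which is one reason a database may be the better choice.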

Non-mandatory extensions

The following extensions are not mandatory, but they improve the system. The only reward for implementing these items is deeper knowledge of Grid programming and, possibly, some street cred from the other students.

Implement a more fine-grained subproblem error recovery mechanism. If a job fails due to errors in a file transfer, it is better to retry the file transfer than to resubmit the job.
To further improve the performance of error recovery, resumed file transfers can be restarted from the point of failure instead of from the first byte of the file.
Implement support for solving more than one problem type. This requires a carefully designed problem solver.

Hints and advice

This assignment can be approached in many different ways. The following list suggests one approach:

Try using GRAM to submit simple jobs.
Try using GridFTP for file staging.
Sketch a high-level architecture that describes the major components, their responsibilities and their interactions.
Design your solution in more detail. Determine what service(s) you will need, and define their portTypes. Also decide which existing WSRF portTypes your service(s) will extend.
Try running the real application (the node program) on the Grid.
Start experimenting with code. Write a simple problem solver prototype.
Build a client/module that retrieves MDS information.
Build a simple resource broker that bases its decisions on the MDS information.
Integrate the resource broker and the problem solver.
Generate the required statistics.
Improve the resource brokering algorithm using the collected statistics.
Improve the fault-tolerance of the system and implement the recovery mechanisms.

It is generally beneficial to build modular systems. Initially, try each component out on its own, then add more functionality and a general interface as you gain more experience with the component. Try to make it as easy as possible to plug the new component into the system.

The coupling between the problem (the node program) and the problem solver should be as loose as possible. Ideally, one should be able to use another node program without modifying the problem solver.

Libraries for the complete Globus toolkit (including GRAM, MDS etc), can be found in /pkg/gt4/4.0.1 on all machines in MA416 and MA426.

Examination

The examination of this assignment consists of four parts, namely:

System design report
Progress discussion
Final report
System demonstration

The first hand-in is a design that outlines your proposed solution. The design should include a description of the system modules, including their responsibilities, their interactions, and service portType(s). Motivate and discuss your particular design choices and what existing portType(s) your services extend. In this hand-in, you must also decide on a time for your progress discussion.

The second part is the progress discussion, where you discuss your progress so far with the teaching assistants. This discussion should also cover how you will tackle the remaining work, and gives you a chance to air any concerns and problems. You may choose the date for your progress discussion freely, but this date must be set when you hand in your design.

The report shall thoroughly document the structure of your solution and motivate your design. However, do not include any test runs, as the correctness of your system (or lack thereof) will be shown during the demonstration. Furthermore, the report shall contain an analysis of a larger test run. This analysis should determine the speedup of your system (compare with solving the problem using only one machine in MA446). The report should also discuss the strengths and weaknesses of the brokering algorithm.

Finally, each group of students will hold a 20 minute demonstration of their system for the teaching assistants. This demonstration is an opportunity to show the finer details of your implementation. You should also be prepared to answer questions regarding your design and brokering algorithm as well as discuss the benefits and drawbacks of your system.

The executable program must run on the Department of Computing Science's Linux system. All code must be placed in the directory ~/edu/grid/lab2.

Due dates

2007-02-08: Design
Your choice: Progress discussion
2007-03-07: Demonstration
2007-03-09: Report

Useful links:

The Globus Toolkit 4 Programmer's tutorial
Java CoG 1.2 API (GridFTP etc.)
Globus Toolkit Java WS Core API (includes notifications)
Axis API
RSL Schema
Submitting a GRAM job in Java
Gram Service API
Gram Client API
Gram Util API
MDS Public Interface Guide
Can't find it? Try the complete Globus documentation
FAQ - will be updated with answers to your questions.

http://www.cs.umu.se/kurser/TDBD20/VT07/lab/lab2.html
Page maintained by: Erik Elmroth, P-O Östberg, Johan Tordsson
Last modified 2007-01-29