High performance computing systems consume and dissipate a great deal of power. Excessive heat dissipation requires aggressive cooling and extra space, both of which add to power consumption and infrastructure cost. Moreover, as system size and operating temperature increase rapidly, high system failure rates are observed. Support for fault detection and management is therefore a feature of interest when scheduling scientific applications in such environments; it characterizes the quality aspect of the time-to-solution.

A solution to the problem of application-level resilience to faults must meet the following requirements: (i) efficiency, without compromising performance; (ii) a user-controlled reliability level, where greater reliability incurs a higher cost (in terms of resources, CPU time, energy consumption, or allocation price); and (iii) minimal code changes in the application. Scheduling algorithms that can detect and manage faults are called fault tolerant (or resilient to faults). The most common fault tolerance strategies include task replication (via double or triple modular redundancy) and application checkpointing. However, it is unclear which of the existing solutions will scale to the size of the exascale computing systems expected by the beginning of the next decade.
Sponsor: University of Basel
Award Restrictions
Applicants for this award must meet the following criteria:
Host Countries | Switzerland |
---|---|
Field Of Study List | Communications, Computer & Information Systems, Engineering and Computer Animation |
Other Criteria | Required qualifications: We are looking for candidates who are highly motivated to conduct quality research, publish in top venues, and pursue a doctoral degree in Computer Science, with a focus on High Performance Computing. Applicants must have: a Master’s degree (or equivalent) in Computer Science, Computer Engineering, or Mathematics; very good programming skills (C, C++, Java); very good knowledge of operating systems, in particular Linux; fluency in English (spoken and written), while knowledge of German, although not required, is a plus; strong team-working abilities; and good analytical skills. Experience in carrying out research projects and writing scientific articles will be considered a plus. Knowledge of hardware component specifications and computing systems monitoring is also a plus. |
Host Institution:
This award is linked to a specific institution.
Award Specifications
Additional information about this award.
Amount | 100% employment |
---|---|
Number of Awards | 1 |
Deadline | September 06 |