BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:1.0
BEGIN:VEVENT
DTSTART:20141118T231500Z
DTEND:20141119T010000Z
LOCATION:New Orleans Theater Lobby
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: Performance irregularities on massively parallel processors lead to load imbalances and a significant loss of performance. Multi-core nodes suggest a promising way to re-distribute work within a node, thus mitigating performance irregularities. However, there exists a non-trivial cost to redistributing work, and associated data, across cores. We investigate how work can be equitably distributed across cores without significantly disturbing data locality, and without incurring significant scheduling overhead. Towards this end, we design a series of scheduling strategies and tuning mechanisms; our foundational technique is intelligent blending of static and dynamic scheduling. We also implement a basic runtime system and library to minimize programmer effort in applying these strategies. Our techniques provide 28.16% performance gains over static scheduling and 17.13% gains over guided scheduling for a widely used regular mesh benchmark, and 44.45% gains over static scheduling and 13.06% gains over guided scheduling for an n-body simulation, both on 1024 nodes.
SUMMARY:Lightweight Scheduling for Improving Load Balance Without Losing Locality
PRIORITY:3
END:VEVENT
END:VCALENDAR