370CT: Parallel Programming - 3

Dr Carey Pridgeon


Created: 2017-03-07 Tue 08:15


Setting Scheduling in OpenMP

#pragma omp for schedule (type)
  • Scheduling lets us specify how the threads OpenMP generates divide up the loop iterations they need to process.
  • There are several options, all of which are set at compile time.
  • Once selected, the schedule stays in force and cannot be changed during runtime (schedule(runtime), covered below, is the exception).
  • Omitting the clause leaves the choice to OpenMP.


#pragma omp for schedule (static)
  • Divide the loop iterations into equal-sized chunks, or as equal as possible.
  • Chunks are then allocated to threads.


#pragma omp for schedule (dynamic)
  • Uses an internal work queue to give a chunk-sized block of loop iterations to each thread.
  • When a thread finishes, it retrieves the next block of loop iterations from the top of the work queue.


#pragma omp for schedule (guided)
  • Similar to dynamic scheduling, but the chunk size starts off large and decreases to better handle load imbalance between iterations.


#pragma omp for schedule (auto)
  • When schedule (auto) is specified, the decision regarding scheduling is delegated to the compiler.
  • This only works well if the program will be executed on a machine similar to the one it was compiled on.


#pragma omp for schedule (runtime)
  • Uses the OMP_SCHEDULE environment variable to specify which one of the three loop-scheduling types (static, dynamic or guided) should be used.
  • This can be useful, since the program need not be recompiled to change the schedule type.
  • How you set the environment variable is platform dependent, so you'd need to look that up, but in general it's:
setenv OMP_SCHEDULE "type"

Schedule Summary

  • As with many features of OpenMP, the ability to set scheduling is of less importance as computers become more advanced.
  • It may be useful on a specialised system, but mostly it won't be.

Environment Variables


  • As we just covered, you can set OpenMP-specific environment variables:
setenv OMP_SCHEDULE "dynamic"

Thread count

  • The OMP_NUM_THREADS environment variable sets the global number of threads per thread pool:
setenv OMP_NUM_THREADS 4
  • It can be overridden by an explicit omp_set_num_threads() call in the running program.
  • It will not work if the number of threads specified is above what the system can support.

Dynamic Thread Count Adjustment

  • The OMP_DYNAMIC environment variable enables or disables dynamic adjustment of the number of threads available for execution of parallel regions.
  • Valid values are TRUE or FALSE.

(This is of questionable value however).

Data sharing in Parallel programming

Data sharing in Parallel programming - 1

  • You will recall from Pthreads that sharing data between threads, where that data must be written to in each thread, incurs overheads.
  • These data dependencies will slow any parallel program to a crawl.
  • Thus, if we are to properly parallelise an algorithm, we must minimise or remove these dependencies.
  • Removal is required if your code is to be considered at all scalable.

Data sharing in Parallel programming - 2

  • Any OMP structured block must be constructed so no thread depends on the results from another or updates values used by another thread.
  • If state information must be shared, then take advantage of the fork and join model.
  • The join region, where multiple threads are collapsed back to one thread, is the place to do this.

Data sharing in Parallel programming - 3

  • You cannot pass data in variables to separate threads in OpenMP at design time, because the threads do not yet exist in your code.
  • Any passed variable that is private to a thread starts out empty inside that thread.
  • A good rule of thumb: if your OpenMP algorithm shares data that is updated in multiple threads, you have bad code.

Data sharing in Parallel programming - 4

  • OpenMP does have facilities to allow you to share data between threads, so it is possible.
  • Possible does not mean wise, however; it's best just to avoid it.
  • If we want to make parallel code properly parallel and state information must be shared:
    • Each object that requires the state of other objects should store that state internally before the structured block begins.
    • You can in fact have a structured block whose entire purpose is this data sharing.

Nested Loops

Parallelising nested loops - 1

  • Nested loops come in two forms: nested and perfectly nested.
  • Just nested means there is code in the outer loop before the inner loop, or more than one inner loop.
#pragma omp parallel for
for (x = 0; x < n; x++) {
    // code thingys
    #pragma omp parallel for
    for (y = 0; y < m; y++) {
        // code thingys
    }
}

Parallelising nested loops - 2

  • Perfectly nested loops have no code between the loop statements:
#pragma omp parallel for collapse(2)
for (x = 0; x < n; x++) {
    for (y = 0; y < m; y++) {
        // code thingys
    }
}
  • Becomes, in effect:
#pragma omp parallel for
for (x = 0; x < n*m; x++) {
    // code thingys
}


  • Write a simple nested loop to make parallel. This nested loop should sum, and display, the total iterations completed.


What does a Section do?

  • The Sections worksharing construct gives a different structured block to each thread.
  • This means that, unlike normal parallel structured blocks, you can run different operations/functions in each thread.

Using Sections - 1

  • Using Sections gives you the ability to run threaded code in your parallel program whose results might not be needed by subsequent operations.
#pragma omp parallel sections
{
    #pragma omp section
        // something()
    #pragma omp section
        // something()
}

Using Sections - 2

  • Sections can be used with the nowait clause
#pragma omp sections nowait
  • nowait will allow functions to be launched and then either allowed to complete at different times, or run for the lifetime of the program.
  • In essence, Sections give you access to a more concurrent form of threading, for example:
  • Menus.
  • Socket-managing threads.
  • Some other thing I haven't thought of.

Sections Practice

  • Write a menu thread to control a simple OpenMP program.
  • This should have options to launch a parallel for loop in a function multiple times (each time the option is selected), and to exit the program.
  • The Sections statement needs to use nowait.
  • Another section should do some maths calculation and print the result.
  • Following the Sections block, have another parallel for loop that repeats, to show that the sections threads are still running.