COMP528


Assignment Resits (2018/19)
4 assignments, each worth 10% of total
your letter will indicate which (if any) assignments you are expected to resit
Resit questions are comparable to the originals, testing the same learning outcomes
you will get lots of hints and help by going back to the lab work and to previous assignments
all codes to be written in C, compiled and benchmarked on Chadwick
standards of academic integrity expected (as per original) - reports may go through "TurnItIn" for automatic checking
you will be marked on the code & report, for correctness & understanding of the topics
Submission: each assignment as a single zip file (comprising report & code & any scripts, plus any supporting evidence you wish)
submission to SAM: 91 for resit#1, 92 for resit#2, 93 for resit#3, 94 for resit#4
DEADLINE for all submissions: 10am, Friday 9th August 2019
Assignment 1: MPI
Assignment #1 Resit
Testing knowledge of
parallel programming & MPI & timing via a batch system
TASK: least squares regression – parallelisation using MPI
https://www.mathsisfun.com/data/least-squares-regression.html
for a set of discrete points (x[i], y[i]), find the best linear fit y = mx + b
using given equations (next slide) to determine m & b
write two C codes to determine m and b for a given input set of x,y
i. A serial test code
ii. One using MPI parallelism
use the Intel compiler and compile with no optimisation ‘-O0’
time the section of the code (running in batch) that finds m & b, and do this
on various numbers of MPI processes; discuss your findings e.g. in terms of
speed-up and parallel efficiency (and Amdahl's Law)
Assignment #1 Resit
Remember:
can parallelise where there is lots of independent work
MPI is a single code with each process having its own "rank" (useful to split up the work?)
MPI provides "reduction" calls, e.g. for doing a summation over processes and storing the result on the "root" process (or on all processes)
MPI provides timing via the MPI_Wtime function; the wall-clock time is the difference between two consecutive calls to MPI_Wtime
note that N may not be equally divisible by the number of MPI processes (the number of processes is available via the MPI_Comm_size function)
Assignment #1 Resit
Data suggestion: use a small set of input data (x,y) to check you are getting the
correct answer (serially and for any number of MPI processes); once all good,
then use the data for the assignment (as below). Remember to use the batch
system to undertake your timings for different numbers of MPI processes
Assignment data:
N=100,000
x[i] = (float)i/1000.0 for i=1 to i=99,999 (note we start at i=1 and go to N-1)
y[i] = sin(x[i]/500.0) / cos(x[i]/499.0 + x[i]) (you will need to include <math.h>)
Assignment #1 Resit
Code: submit both serial & MPI code
Submit any scripts used
Report: up to 3 pages
Discussion of your approach & of your results
Give the commands that you use to
Compile
Submit and run your parallel code
The equation of the best fit straight line
Marking
Correctness of codes: 50%
Explaining/understanding parallel principles & MPI: 25%
Discussion of results: 25%
Assignment 2: OpenMP
Assignment #2 Resit
Testing knowledge of
parallel programming & OpenMP & timing via a batch system
TASK: least squares regression – parallelisation using OpenMP
(see Assignment#1 for detailed description)
for a set of discrete points (x[i], y[i]), the best linear fit y = mx + b
using given equations (next slide) to determine m & b
use the same assignment data as described for Assignment#1 Resit
write a C code to determine m and b for a given input set of x,y that uses
OpenMP work-sharing constructs to parallelise the work
use the Intel compiler and compile with no optimisation '-O0'
time the section of the code (running in batch) that finds m & b, and do this
on various numbers of OpenMP threads; discuss your findings e.g. in terms
of speed-up and parallel efficiency (and Amdahl's Law)
Assignment #2 Resit
Remember: can parallelise where there is lots of independent work
OpenMP is a single code with fork-join parallel regions in which each thread
has its own thread number. Typically parallelise at the 'for' loop level
OpenMP provides a "reduction" clause, e.g. for doing a summation over
threads and storing the result on the "master" thread
OpenMP provides the omp_get_wtime timing function; the wall-clock
time is the difference between two consecutive calls
OpenMP loop parallelisation can have different “schedules” which may be
useful for irregular work distribution between threads
You can use compiler flags to have the compiler ignore all OpenMP directives (giving a serial baseline)
Assignment #2 Resit
Code
Submit OpenMP code
Submit any scripts used
Report: up to 3 pages
Discussion of your approach & of your results
Give the commands that you use to
Compile
Submit and run your parallel code
The equation of the best fit straight line
Marking
Correctness of code: 50%
Explaining/understanding parallel principles & OpenMP: 25%
Discussion of results: 25%
Assignment 3: GPU Programming
Assignment #3 Resit
Testing knowledge of
parallel programming of GPUs
TASK: discretization using GPU
Function f(x) = exp(x/3.1) - x*x*x*x*18.0
You need to discretize this between x=0.0 and x=60.0 and find the minimum
using 33M points
Write a C-based code with an accelerated kernel written in either CUDA or
using OpenACC directives; the code should
time a serial run comprising setting values and then finding minimum (i.e. all on the CPU)
time an accelerated run with values set on the GPU, passed back to CPU and the
minimum found on the CPU
Assignment #3 Resit
Reminder for CUDA
write C + CUDA kernel in file e.g. myCode.cu (note the .cu suffix)
compile (on login node):
module load cuda-8.0
nvcc -Xcompiler -fopenmp myCode.cu
debug running in batch
qrsh -l gputype=tesla,h_rt=00:10:00 -pe smp 1-16 -V -cwd ./a.out
timing run in batch (hogging all GPU & CPU cores for yourself)
qrsh -l gputype=tesla,exclusive,h_rt=00:10:00 -pe smp 16 -V -cwd ./a.out
For OpenACC
please see lecture notes
Assignment #3 Resit
Code
Submit code and any scripts used
Report: up to 3 pages
Discussion of your approach & of your results
including the speed ratio of GPU to CPU
noting whether you include GPU memory & data-transfer costs (and what effect this would have)
Give the commands that you use to
Compile, submit and run your parallel code
Give the value of the minimum of f(x[i]) and the value of x[i] at which it occurs
Marking
Correctness of code: 40%
Explaining/understanding parallel principles & GPUs: 30%
Discussion of results: 30%
Assignment 4: hybrid programming
Assignment #4 Resit
Testing knowledge of
parallel programming & hybrid MPI+OpenMP parallelism
TASK: hybrid MPI+OpenMP parallelisation of galaxy formation
using the C code “COMP528-assign4-resit.c” provided in Sub-Section “Resit
Assignments” at https://cgi.csc.liv.ac.uk/~mkbane/COMP528/
add MPI and OpenMP to accelerate the simulation (including, if appropriate, the
initialisation); as per the original assignment, use MPI to parallelise at a coarse
grained level (dividing the number of bodies (variable “BODIES”) between the
number of processes) and each MPI process then using OpenMP to parallelise its
work
use the Intel compiler and compile with optimisation flag ‘-O2’
time the section of the code (running in batch) that simulates the movement of the
galaxies, and do this on various numbers of MPI processes & OpenMP threads
Assignment #4 Resit
Code - submit MPI+OpenMP code & any scripts used
Report: up to 3 pages
Discussion of your approach & of your results
how you determined what to parallelise & explain why you chose the given parallelisation method
the results (accuracy, speed-up, parallel efficiency)
which combination of MPI/OpenMP you found to be the fastest
Include a paragraph on what you would need to scale the number of BODIES by a
factor of 100 (and keep the run time about the same)
e.g. is Barkla big enough? is CPU the only option?
State commands that you use to
Compile, submit, run & time your code to get timing data presented
Marking
Code: 30%
Explaining/understanding parallel principles used: 25%
Discussion on scaling BODIES by a factor of 100: 20%
Discussion of results: 25%
Good luck!
Ask if you have any questions!
