-
Tomas Flekal authoredTomas Flekal authored
README.md 5.08 KiB
Insertion sort
Overview
- Application has 3 implementation of Insertion sort
- Plain version - this version implements single thread solution
- Parallel version - my implementation of parallel Insertion sort
- ChatGPT OMP parallel - this version was designed by me and written by chatGPT
How to build and run the application
- Download the repository from gitlab.
- Open clion and build the project.
- Run the project either from terminal or from readme with prepared examples.
Run the application with parameters
- Application has multiple parameters.
- You can choose with what parameters the application should be run.
- After every run results are validated and time is printed to stdout.
- NOTE: When loading data from file, avoid using negative values, as this program is designed for positive huge numbers only
All parameters
- accessible through --help or -h
cd cmake-build-debug && ./sem_flekal_app -h
usage: sem_flekal_app [options]
options, with default in [ ]:
-h --help prints this help message
-i=<number> specifies how long random array should be sorted
-p prints random generated array on entry and sorted one at the end
-f=<file_path> specifies path to the file from which to load data
-c=<number> specifies number of cores the program should run
if optional parameter in other option is used, last value is used
--plain runs non parallel version of program
--parallel[=num_of_core] runs parallel version of program with optional num of cores
if nothing given, maximum number of cores is used
if '-c' parameter is used, last value is used
--chatGPT_omp_parallel runs parallel version of program using omp parallel generated by chatGPT
--generate_full_report generates full report for maximum system threads
Examples of run configurations
- Plain insertion sort for array with 50000 elements
cd cmake-build-debug && ./sem_flekal_app --plain -i=50000
- Parallel insertion sort for array with 50000 elements with 8 threads
cd cmake-build-debug && ./sem_flekal_app --parallel -i=50000 -c=8
- Parallel insertion sort for array with 50000 elements with 8 threads, where data were inputted through file
cd cmake-build-debug && ./sem_flekal_app --parallel -f=../example/input1 -c=8
- ChatGPT OMP parallel insertion sort for array with 50000 elements
cd cmake-build-debug && ./sem_flekal_app --chatGPT_omp_parallel -i=50000
- Generate full report to stdout
cd cmake-build-debug && ./sem_flekal_app --generate_full_report
Report
Assigment
- Create two versions of Insertion sort and compare them. One should be parallel, second should be optimized for one thread.
- You can also compare OMP parallel with your implementation
Results
-
The measurements were conducted on commit 4b5a0891 using an 11th Gen Intel® Core™ i7-1165G7 processor @ 2.80GHz with 8 CPU cores on Ubuntu 24.04.1 LTS.
-
For each measurement, a new randomly shuffled vector of numbers was generated within a defined range.
-
For instance, the measurement for a vector of size 5000 used numbers in the range <0, 4999>, with each number appearing only once.
-
The results are presented in the table and graph below.
-
All measurements are recorded in milliseconds.
-
To replicate the measurements, use the following command:
cd cmake-build-debug && ./sem_flekal_app --generate_full_report
Performance Results for Random Array of Various Lengths
Array Length | Plain | Parallel | ChatGPT OMP Parallel |
---|---|---|---|
100 | 0 ms | 0 ms | 0 ms |
500 | 0 ms | 0 ms | 0 ms |
1000 | 0 ms | 0 ms | 0 ms |
5000 | 2 ms | 0 ms | 1 ms |
10000 | 10 ms | 1 ms | 2 ms |
15000 | 22 ms | 1 ms | 4 ms |
20000 | 38 ms | 2 ms | 4 ms |
100000 | 855 ms | 43 ms | 24 ms |
200000 | 3549 ms | 168 ms | 56 ms |
300000 | 9260 ms | 330 ms | 74 ms |
400000 | 14254 ms | 575 ms | 95 ms |
500000 | 22499 ms | 898 ms | 116 ms |
600000 | 32835 ms | 1327 ms | 160 ms |
700000 | 44944 ms | 1676 ms | 165 ms |
800000 | 58115 ms | 2183 ms | 184 ms |
900000 | 74858 ms | 2874 ms | 229 ms |
1000000 | 102077 ms | 4935 ms | 326 ms |
2000000 | 449586 ms | 20133 ms | 745 ms |
3000000 | 1576574 ms | 58261 ms | 1106 ms |
- From the results we can see that OMP parallel version is the fastest by far.
- My parallel version is faster than OMP with the shorter vectors, but around 100000 elements the omp became faster.