I implemented the Merge Sort in Java and in C++ and I tried to implement them as similar as possible. Both algorithms work, I tested them many times. The problem is that my Java-Implementation is a lot faster than my C++-Implementation and I am wondering why. I can't believe that Java would be faster so I guess I made a mistake in one of the implementations. In order to measure the runtime I created a class "Person" which has two String-Attributes (forename, lastname). In C++ I used std::vector<Person*>
and in Java I used ArrayList<Person>
. Additionally I overloaded the operator<
in C++ to compare two Persons (compare the lastname, if equal compare the firstname). In Java I implemented the interface Comparable<Person>
to compare two Persons.
Can you find a mistake in my code or a reason why Java would be faster or C++ would be slower? Any help would be appreciated.
My Java Code:
public void mergeSort(List<T> list) {
if (list.size() <= 1) {
return;
}
int subLength = (int) (list.size() / 2);
List<T> first = new ArrayList<T>(list.subList(0, subLength));
List<T> second = new ArrayList<T>(list.subList(subLength, list.size()));
mergeSort(first);
mergeSort(second);
merge(first, second, list);
return;
}
private void merge(List<T> first, List<T> second, List<T> result) {
int firstPos = 0, secondPos = 0, resultPos = 0;
while (firstPos < first.size() && secondPos < second.size()) {
if (first.get(firstPos).compareTo(second.get(secondPos)) < 0) {
result.set(resultPos, first.get(firstPos));
firstPos++;
} else {
result.set(resultPos, second.get(secondPos));
secondPos++;
}
resultPos++;
}
for (int i = firstPos; i < first.size(); i++) {
result.set(resultPos, first.get(i));
resultPos++;
}
for (int i = secondPos; i < second.size(); i++) {
result.set(resultPos, second.get(i));
resultPos++;
}
}
My C++ Code:
Note: I used two template-methods to make the mergesort usable with both Person
and Person*
.
template<typename T>
T * ptr(T & obj) { return &obj; }
template<typename T>
T * ptr(T * obj) { return obj; }
void mergeSort(std::vector<T> &list) {
if (list.size() <= 1) {
return;
}
int subLength = (int)(list.size() / 2);
std::vector<T> first(list.begin(), list.begin() + subLength);
std::vector<T> second(list.begin() + subLength, list.end());
mergeSort(first);
mergeSort(second);
merge(first, second, list);
}
void merge(const std::vector<T> &first, const std::vector<T> &second, std::vector<T> &result) {
int firstPos = 0, secondPos = 0, resultPos = 0;
while (firstPos < first.size() && secondPos < second.size()) {
if (*ptr(first[firstPos]) < *ptr(second[secondPos])) {
result[resultPos] = first[firstPos];
firstPos++;
}
else {
result[resultPos] = second[secondPos];
secondPos++;
}
resultPos++;
}
for (int i = firstPos; i < first.size(); i++) {
result[resultPos] = first[i];
resultPos++;
}
for (int i = secondPos; i < second.size(); i++) {
result[resultPos] = second[i];
resultPos++;
}
}
Edit1 and 2:
My setup-configuration:
I used one million, 10 millionen and 20 millionen persons to test the implementations. I doesn't really matter with how many Persons I test it with, Java is always faster.
And I test it the following way: I create persons and initialize my MergeSort
-class. Then I start the measurement and I call my mergeSort
-method. When the sorting is finished I stop the measurement. (Deleting is happening after the time measurement). For the time measurement in Java I use System.nanoTime()
and in C++ I use chrono::high_resolution_clock::time_point
Of course I compiled C++ in "Release"-Mode (Optimization: O2, faster code preferred).
My Test-PC:
- Intel i5 2500
- 8GB DDR3 RAM
- Windows 10 Education
Edit3:
There is one thing I forgot to mention. I implemented the algorithm in a generic way in order to use simple data-types as well as objects. When I use std::vector<int>
and ArrayList<Integer>
in Java then my C++ implementation is a lot faster. My first attempt was to use std::vector<Person>
but it was even slower. So my guess was that it made deep copies instead of shallow ones and that's why I switched to Person*
because I thought that when copying is happening only the pointers will be copied.
std::list
– alain 2 hours agovector
is the best container to use here (looking at the code!)std::list
only adds overhead. (BTW*ptr(...)
can be safely omitted) – Elijan9 2 hours ago