Parallel and Single Threaded code having relatively same performance in Java
I wrote a program to render a Julia Set. The single threaded code is pretty straightforward and is essentially like so:
private Image drawFractal() {
BufferedImage img = new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_ARGB);
for (int x = 0; x < WIDTH; x++) {
for (int y = 0; y < HEIGHT; y++) {
double X = map(x,0,WIDTH,-2.0,2.0);
double Y = map(y,0,HEIGHT,-1.0,1.0);
int color = getPixelColor(X,Y);
img.setRGB(x,y,color);
}
}
return img;
}
private int getPixelColor(double x, double y) {
float hue;
float saturation = 1f;
float brightness;
ComplexNumber z = new ComplexNumber(x, y);
int i;
for (i = 0; i < maxiter; i++) {
z.square();
z.add(c);
if (z.mod() > blowup) {
break;
}
}
brightness = (i < maxiter) ? 1f : 0;
hue = (i%maxiter)/(float)maxiter;
int rgb = Color.getHSBColor(hue,saturation,brightness).getRGB();
return rgb;
}
As you can see it is highly inefficient. Thus I went for Parallelizing this code using the fork/join framework in Java and this is what I came up with:
private Image drawFractal() {
BufferedImage img = new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_ARGB);
ForkCalculate fork = new ForkCalculate(img, 0, WIDTH, HEIGHT);
ForkJoinPool forkPool = new ForkJoinPool();
forkPool.invoke(fork);
return img;
}
//ForkCalculate.java
public class ForkCalculate extends RecursiveAction {
BufferedImage img;
int minWidth;
int maxWidth;
int height;
int threshold;
int numPixels;
ForkCalculate(BufferedImage b, int minW, int maxW, int h) {
img = b;
minWidth = minW;
maxWidth = maxW;
height = h;
threshold = 100000; //TODO : Experiment with this value.
numPixels = (maxWidth - minWidth) * height;
}
void computeDirectly() {
for (int x = minWidth; x < maxWidth; x++) {
for (int y = 0; y < height; y++) {
double X = map(x,0,Fractal.WIDTH,-2.0,2.0);
double Y = map(y,0,Fractal.HEIGHT,-1.0,1.0);
int color = getPixelColor(X,Y);
img.setRGB(x,y,color);
}
}
}
@Override
protected void compute() {
if(numPixels < threshold) {
computeDirectly();
return;
}
int split = (minWidth + maxWidth)/2;
invokeAll(new ForkCalculate(img, minWidth, split, height), new ForkCalculate(img, split, maxWidth, height));
}
private int getPixelColor(double x, double y) {
float hue;
float saturation = 1f;
float brightness;
ComplexNumber z = new ComplexNumber(x, y);
int i;
for (i = 0; i < Fractal.maxiter; i++) {
z.square();
z.add(Fractal.c);
if (z.mod() > Fractal.blowup) {
break;
}
}
brightness = (i < Fractal.maxiter) ? 1f : 0;
hue = (i%Fractal.maxiter)/(float)Fractal.maxiter;
int rgb = Color.getHSBColor(hue*5,saturation,brightness).getRGB();
return rgb;
}
private double map(double x, double in_min, double in_max, double out_min, double out_max) {
return (x-in_min)*(out_max-out_min)/(in_max-in_min) + out_min;
}
}
I tested with a range of values varying the maxiter, blowup and threshold.
I made the threshold such that the number of threads are around the same as the number of cores that I have (4).
I measured the runtimes in both cases and expected some optimization in parallelized code. However the code ran in the same time if not slower sometimes. This has me baffled. Is this happening because the problem size isn't big enough? I also tested with varying image sizes ranging from 640*400 to 1020*720.
Why is this happening? How can I run the code parallely so that it runs faster as it should?
Edit
If you want to checkout the code in its entirety head over to my Github
The master branch has the single threaded code.
The branch with the name Multicore has the Parallelized code.
Edit 2 Image of the fractal for reference.
java concurrency fractals fork-join
|
show 2 more comments
I wrote a program to render a Julia Set. The single threaded code is pretty straightforward and is essentially like so:
private Image drawFractal() {
BufferedImage img = new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_ARGB);
for (int x = 0; x < WIDTH; x++) {
for (int y = 0; y < HEIGHT; y++) {
double X = map(x,0,WIDTH,-2.0,2.0);
double Y = map(y,0,HEIGHT,-1.0,1.0);
int color = getPixelColor(X,Y);
img.setRGB(x,y,color);
}
}
return img;
}
private int getPixelColor(double x, double y) {
float hue;
float saturation = 1f;
float brightness;
ComplexNumber z = new ComplexNumber(x, y);
int i;
for (i = 0; i < maxiter; i++) {
z.square();
z.add(c);
if (z.mod() > blowup) {
break;
}
}
brightness = (i < maxiter) ? 1f : 0;
hue = (i%maxiter)/(float)maxiter;
int rgb = Color.getHSBColor(hue,saturation,brightness).getRGB();
return rgb;
}
As you can see it is highly inefficient. Thus I went for Parallelizing this code using the fork/join framework in Java and this is what I came up with:
private Image drawFractal() {
BufferedImage img = new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_ARGB);
ForkCalculate fork = new ForkCalculate(img, 0, WIDTH, HEIGHT);
ForkJoinPool forkPool = new ForkJoinPool();
forkPool.invoke(fork);
return img;
}
//ForkCalculate.java
public class ForkCalculate extends RecursiveAction {
BufferedImage img;
int minWidth;
int maxWidth;
int height;
int threshold;
int numPixels;
ForkCalculate(BufferedImage b, int minW, int maxW, int h) {
img = b;
minWidth = minW;
maxWidth = maxW;
height = h;
threshold = 100000; //TODO : Experiment with this value.
numPixels = (maxWidth - minWidth) * height;
}
void computeDirectly() {
for (int x = minWidth; x < maxWidth; x++) {
for (int y = 0; y < height; y++) {
double X = map(x,0,Fractal.WIDTH,-2.0,2.0);
double Y = map(y,0,Fractal.HEIGHT,-1.0,1.0);
int color = getPixelColor(X,Y);
img.setRGB(x,y,color);
}
}
}
@Override
protected void compute() {
if(numPixels < threshold) {
computeDirectly();
return;
}
int split = (minWidth + maxWidth)/2;
invokeAll(new ForkCalculate(img, minWidth, split, height), new ForkCalculate(img, split, maxWidth, height));
}
private int getPixelColor(double x, double y) {
float hue;
float saturation = 1f;
float brightness;
ComplexNumber z = new ComplexNumber(x, y);
int i;
for (i = 0; i < Fractal.maxiter; i++) {
z.square();
z.add(Fractal.c);
if (z.mod() > Fractal.blowup) {
break;
}
}
brightness = (i < Fractal.maxiter) ? 1f : 0;
hue = (i%Fractal.maxiter)/(float)Fractal.maxiter;
int rgb = Color.getHSBColor(hue*5,saturation,brightness).getRGB();
return rgb;
}
private double map(double x, double in_min, double in_max, double out_min, double out_max) {
return (x-in_min)*(out_max-out_min)/(in_max-in_min) + out_min;
}
}
I tested with a range of values varying the maxiter, blowup and threshold.
I made the threshold such that the number of threads are around the same as the number of cores that I have (4).
I measured the runtimes in both cases and expected some optimization in parallelized code. However the code ran in the same time if not slower sometimes. This has me baffled. Is this happening because the problem size isn't big enough? I also tested with varying image sizes ranging from 640*400 to 1020*720.
Why is this happening? How can I run the code parallely so that it runs faster as it should?
Edit
If you want to checkout the code in its entirety head over to my Github
The master branch has the single threaded code.
The branch with the name Multicore has the Parallelized code.
Edit 2 Image of the fractal for reference.
java concurrency fractals fork-join
1
This link, gee.cs.oswego.edu/dl/html/StreamParallelGuidance.html is about parallel streams but the focus is on performance with FG. There have been a lot of questions on why parallel is sometimes slower and all the reasons are too numerous to mention here. Data Parallelism, en.wikipedia.org/wiki/Data_parallelism requires many processors to do properly -- 4 is a small number.
– edharned
Sep 1 '18 at 14:01
@edharned What do you mean when you say "FG"? I scanned over the document and it is a lot to take in. Although does this mean for this particular problem ( Julia Set ) it is worthless to write parallelized code?
– Rohit Sarkar
Sep 1 '18 at 14:11
1
FG is ForkJoin. Parallel Streams are based on FG. Search for "java parallel is sometimes slower than synchronous" turns up many questions/answers here on SO. Only you can answer the question of whether to parallelize or not. As I said above, Data Parallel requires spreading the work around many, many processors. Data Parallel used to be limited to Massively Parallel Processors (scientific/academic) before the 2, 4, 8, etc. core machines became popular. Run your problem on a 256 core machine and see what happens.
– edharned
Sep 1 '18 at 15:56
@edharned Thanks for the pointers. I will look it up. The reason I was doing this was I had seen a C++ video ( youtube.com/watch?v=Pc8DfEyAxzg ) related to the same topic of generating fractals where he optimized his code quite a lot by parallelizing it. I was hoping to achieve the same with Java. Hopefully I will be able to do it.
– Rohit Sarkar
Sep 1 '18 at 16:16
1
There are certainly several reasons of why you don't see a speedup. First: There are hardly any iterations done! Most of the fractal does not reach themaxiter
limit. Useimaginary = 0.056;
andmaxiter=1000
to generate some workload. For me, the single threaded version then takes ~1500ms, and the parallel one ~800ms. Further optimizations would probably be possible. I've seen the "multicore" branch: My gut feeling is that simply tiling the image into maybe 10x10 tiles and throwing them into a Threadpool ExecutorService could be faster (but haven't investigated it in detail)
– Marco13
Oct 30 '18 at 22:06
|
show 2 more comments
I wrote a program to render a Julia Set. The single threaded code is pretty straightforward and is essentially like so:
private Image drawFractal() {
BufferedImage img = new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_ARGB);
for (int x = 0; x < WIDTH; x++) {
for (int y = 0; y < HEIGHT; y++) {
double X = map(x,0,WIDTH,-2.0,2.0);
double Y = map(y,0,HEIGHT,-1.0,1.0);
int color = getPixelColor(X,Y);
img.setRGB(x,y,color);
}
}
return img;
}
private int getPixelColor(double x, double y) {
float hue;
float saturation = 1f;
float brightness;
ComplexNumber z = new ComplexNumber(x, y);
int i;
for (i = 0; i < maxiter; i++) {
z.square();
z.add(c);
if (z.mod() > blowup) {
break;
}
}
brightness = (i < maxiter) ? 1f : 0;
hue = (i%maxiter)/(float)maxiter;
int rgb = Color.getHSBColor(hue,saturation,brightness).getRGB();
return rgb;
}
As you can see it is highly inefficient. Thus I went for Parallelizing this code using the fork/join framework in Java and this is what I came up with:
private Image drawFractal() {
BufferedImage img = new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_ARGB);
ForkCalculate fork = new ForkCalculate(img, 0, WIDTH, HEIGHT);
ForkJoinPool forkPool = new ForkJoinPool();
forkPool.invoke(fork);
return img;
}
//ForkCalculate.java
public class ForkCalculate extends RecursiveAction {
BufferedImage img;
int minWidth;
int maxWidth;
int height;
int threshold;
int numPixels;
ForkCalculate(BufferedImage b, int minW, int maxW, int h) {
img = b;
minWidth = minW;
maxWidth = maxW;
height = h;
threshold = 100000; //TODO : Experiment with this value.
numPixels = (maxWidth - minWidth) * height;
}
void computeDirectly() {
for (int x = minWidth; x < maxWidth; x++) {
for (int y = 0; y < height; y++) {
double X = map(x,0,Fractal.WIDTH,-2.0,2.0);
double Y = map(y,0,Fractal.HEIGHT,-1.0,1.0);
int color = getPixelColor(X,Y);
img.setRGB(x,y,color);
}
}
}
@Override
protected void compute() {
if(numPixels < threshold) {
computeDirectly();
return;
}
int split = (minWidth + maxWidth)/2;
invokeAll(new ForkCalculate(img, minWidth, split, height), new ForkCalculate(img, split, maxWidth, height));
}
private int getPixelColor(double x, double y) {
float hue;
float saturation = 1f;
float brightness;
ComplexNumber z = new ComplexNumber(x, y);
int i;
for (i = 0; i < Fractal.maxiter; i++) {
z.square();
z.add(Fractal.c);
if (z.mod() > Fractal.blowup) {
break;
}
}
brightness = (i < Fractal.maxiter) ? 1f : 0;
hue = (i%Fractal.maxiter)/(float)Fractal.maxiter;
int rgb = Color.getHSBColor(hue*5,saturation,brightness).getRGB();
return rgb;
}
private double map(double x, double in_min, double in_max, double out_min, double out_max) {
return (x-in_min)*(out_max-out_min)/(in_max-in_min) + out_min;
}
}
I tested with a range of values varying the maxiter, blowup and threshold.
I made the threshold such that the number of threads are around the same as the number of cores that I have (4).
I measured the runtimes in both cases and expected some optimization in parallelized code. However the code ran in the same time if not slower sometimes. This has me baffled. Is this happening because the problem size isn't big enough? I also tested with varying image sizes ranging from 640*400 to 1020*720.
Why is this happening? How can I run the code parallely so that it runs faster as it should?
Edit
If you want to checkout the code in its entirety head over to my Github
The master branch has the single threaded code.
The branch with the name Multicore has the Parallelized code.
Edit 2 Image of the fractal for reference.
java concurrency fractals fork-join
I wrote a program to render a Julia Set. The single threaded code is pretty straightforward and is essentially like so:
private Image drawFractal() {
BufferedImage img = new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_ARGB);
for (int x = 0; x < WIDTH; x++) {
for (int y = 0; y < HEIGHT; y++) {
double X = map(x,0,WIDTH,-2.0,2.0);
double Y = map(y,0,HEIGHT,-1.0,1.0);
int color = getPixelColor(X,Y);
img.setRGB(x,y,color);
}
}
return img;
}
private int getPixelColor(double x, double y) {
float hue;
float saturation = 1f;
float brightness;
ComplexNumber z = new ComplexNumber(x, y);
int i;
for (i = 0; i < maxiter; i++) {
z.square();
z.add(c);
if (z.mod() > blowup) {
break;
}
}
brightness = (i < maxiter) ? 1f : 0;
hue = (i%maxiter)/(float)maxiter;
int rgb = Color.getHSBColor(hue,saturation,brightness).getRGB();
return rgb;
}
As you can see it is highly inefficient. Thus I went for Parallelizing this code using the fork/join framework in Java and this is what I came up with:
private Image drawFractal() {
BufferedImage img = new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_ARGB);
ForkCalculate fork = new ForkCalculate(img, 0, WIDTH, HEIGHT);
ForkJoinPool forkPool = new ForkJoinPool();
forkPool.invoke(fork);
return img;
}
//ForkCalculate.java
public class ForkCalculate extends RecursiveAction {
BufferedImage img;
int minWidth;
int maxWidth;
int height;
int threshold;
int numPixels;
ForkCalculate(BufferedImage b, int minW, int maxW, int h) {
img = b;
minWidth = minW;
maxWidth = maxW;
height = h;
threshold = 100000; //TODO : Experiment with this value.
numPixels = (maxWidth - minWidth) * height;
}
void computeDirectly() {
for (int x = minWidth; x < maxWidth; x++) {
for (int y = 0; y < height; y++) {
double X = map(x,0,Fractal.WIDTH,-2.0,2.0);
double Y = map(y,0,Fractal.HEIGHT,-1.0,1.0);
int color = getPixelColor(X,Y);
img.setRGB(x,y,color);
}
}
}
@Override
protected void compute() {
if(numPixels < threshold) {
computeDirectly();
return;
}
int split = (minWidth + maxWidth)/2;
invokeAll(new ForkCalculate(img, minWidth, split, height), new ForkCalculate(img, split, maxWidth, height));
}
private int getPixelColor(double x, double y) {
float hue;
float saturation = 1f;
float brightness;
ComplexNumber z = new ComplexNumber(x, y);
int i;
for (i = 0; i < Fractal.maxiter; i++) {
z.square();
z.add(Fractal.c);
if (z.mod() > Fractal.blowup) {
break;
}
}
brightness = (i < Fractal.maxiter) ? 1f : 0;
hue = (i%Fractal.maxiter)/(float)Fractal.maxiter;
int rgb = Color.getHSBColor(hue*5,saturation,brightness).getRGB();
return rgb;
}
private double map(double x, double in_min, double in_max, double out_min, double out_max) {
return (x-in_min)*(out_max-out_min)/(in_max-in_min) + out_min;
}
}
I tested with a range of values varying the maxiter, blowup and threshold.
I made the threshold such that the number of threads are around the same as the number of cores that I have (4).
I measured the runtimes in both cases and expected some optimization in parallelized code. However the code ran in the same time if not slower sometimes. This has me baffled. Is this happening because the problem size isn't big enough? I also tested with varying image sizes ranging from 640*400 to 1020*720.
Why is this happening? How can I run the code parallely so that it runs faster as it should?
Edit
If you want to checkout the code in its entirety head over to my Github
The master branch has the single threaded code.
The branch with the name Multicore has the Parallelized code.
Edit 2 Image of the fractal for reference.
java concurrency fractals fork-join
java concurrency fractals fork-join
edited Sep 1 '18 at 14:45
Rohit Sarkar
asked Sep 1 '18 at 13:36
Rohit SarkarRohit Sarkar
215
215
1
This link, gee.cs.oswego.edu/dl/html/StreamParallelGuidance.html is about parallel streams but the focus is on performance with FG. There have been a lot of questions on why parallel is sometimes slower and all the reasons are too numerous to mention here. Data Parallelism, en.wikipedia.org/wiki/Data_parallelism requires many processors to do properly -- 4 is a small number.
– edharned
Sep 1 '18 at 14:01
@edharned What do you mean when you say "FG"? I scanned over the document and it is a lot to take in. Although does this mean for this particular problem ( Julia Set ) it is worthless to write parallelized code?
– Rohit Sarkar
Sep 1 '18 at 14:11
1
FG is ForkJoin. Parallel Streams are based on FG. Search for "java parallel is sometimes slower than synchronous" turns up many questions/answers here on SO. Only you can answer the question of whether to parallelize or not. As I said above, Data Parallel requires spreading the work around many, many processors. Data Parallel used to be limited to Massively Parallel Processors (scientific/academic) before the 2, 4, 8, etc. core machines became popular. Run your problem on a 256 core machine and see what happens.
– edharned
Sep 1 '18 at 15:56
@edharned Thanks for the pointers. I will look it up. The reason I was doing this was I had seen a C++ video ( youtube.com/watch?v=Pc8DfEyAxzg ) related to the same topic of generating fractals where he optimized his code quite a lot by parallelizing it. I was hoping to achieve the same with Java. Hopefully I will be able to do it.
– Rohit Sarkar
Sep 1 '18 at 16:16
1
There are certainly several reasons of why you don't see a speedup. First: There are hardly any iterations done! Most of the fractal does not reach themaxiter
limit. Useimaginary = 0.056;
andmaxiter=1000
to generate some workload. For me, the single threaded version then takes ~1500ms, and the parallel one ~800ms. Further optimizations would probably be possible. I've seen the "multicore" branch: My gut feeling is that simply tiling the image into maybe 10x10 tiles and throwing them into a Threadpool ExecutorService could be faster (but haven't investigated it in detail)
– Marco13
Oct 30 '18 at 22:06
|
show 2 more comments
1
This link, gee.cs.oswego.edu/dl/html/StreamParallelGuidance.html is about parallel streams but the focus is on performance with FG. There have been a lot of questions on why parallel is sometimes slower and all the reasons are too numerous to mention here. Data Parallelism, en.wikipedia.org/wiki/Data_parallelism requires many processors to do properly -- 4 is a small number.
– edharned
Sep 1 '18 at 14:01
@edharned What do you mean when you say "FG"? I scanned over the document and it is a lot to take in. Although does this mean for this particular problem ( Julia Set ) it is worthless to write parallelized code?
– Rohit Sarkar
Sep 1 '18 at 14:11
1
FG is ForkJoin. Parallel Streams are based on FG. Search for "java parallel is sometimes slower than synchronous" turns up many questions/answers here on SO. Only you can answer the question of whether to parallelize or not. As I said above, Data Parallel requires spreading the work around many, many processors. Data Parallel used to be limited to Massively Parallel Processors (scientific/academic) before the 2, 4, 8, etc. core machines became popular. Run your problem on a 256 core machine and see what happens.
– edharned
Sep 1 '18 at 15:56
@edharned Thanks for the pointers. I will look it up. The reason I was doing this was I had seen a C++ video ( youtube.com/watch?v=Pc8DfEyAxzg ) related to the same topic of generating fractals where he optimized his code quite a lot by parallelizing it. I was hoping to achieve the same with Java. Hopefully I will be able to do it.
– Rohit Sarkar
Sep 1 '18 at 16:16
1
There are certainly several reasons of why you don't see a speedup. First: There are hardly any iterations done! Most of the fractal does not reach themaxiter
limit. Useimaginary = 0.056;
andmaxiter=1000
to generate some workload. For me, the single threaded version then takes ~1500ms, and the parallel one ~800ms. Further optimizations would probably be possible. I've seen the "multicore" branch: My gut feeling is that simply tiling the image into maybe 10x10 tiles and throwing them into a Threadpool ExecutorService could be faster (but haven't investigated it in detail)
– Marco13
Oct 30 '18 at 22:06
1
1
This link, gee.cs.oswego.edu/dl/html/StreamParallelGuidance.html is about parallel streams but the focus is on performance with FG. There have been a lot of questions on why parallel is sometimes slower and all the reasons are too numerous to mention here. Data Parallelism, en.wikipedia.org/wiki/Data_parallelism requires many processors to do properly -- 4 is a small number.
– edharned
Sep 1 '18 at 14:01
This link, gee.cs.oswego.edu/dl/html/StreamParallelGuidance.html is about parallel streams but the focus is on performance with FG. There have been a lot of questions on why parallel is sometimes slower and all the reasons are too numerous to mention here. Data Parallelism, en.wikipedia.org/wiki/Data_parallelism requires many processors to do properly -- 4 is a small number.
– edharned
Sep 1 '18 at 14:01
@edharned What do you mean when you say "FG"? I scanned over the document and it is a lot to take in. Although does this mean for this particular problem ( Julia Set ) it is worthless to write parallelized code?
– Rohit Sarkar
Sep 1 '18 at 14:11
@edharned What do you mean when you say "FG"? I scanned over the document and it is a lot to take in. Although does this mean for this particular problem ( Julia Set ) it is worthless to write parallelized code?
– Rohit Sarkar
Sep 1 '18 at 14:11
1
1
FG is ForkJoin. Parallel Streams are based on FG. Search for "java parallel is sometimes slower than synchronous" turns up many questions/answers here on SO. Only you can answer the question of whether to parallelize or not. As I said above, Data Parallel requires spreading the work around many, many processors. Data Parallel used to be limited to Massively Parallel Processors (scientific/academic) before the 2, 4, 8, etc. core machines became popular. Run your problem on a 256 core machine and see what happens.
– edharned
Sep 1 '18 at 15:56
FG is ForkJoin. Parallel Streams are based on FG. Search for "java parallel is sometimes slower than synchronous" turns up many questions/answers here on SO. Only you can answer the question of whether to parallelize or not. As I said above, Data Parallel requires spreading the work around many, many processors. Data Parallel used to be limited to Massively Parallel Processors (scientific/academic) before the 2, 4, 8, etc. core machines became popular. Run your problem on a 256 core machine and see what happens.
– edharned
Sep 1 '18 at 15:56
@edharned Thanks for the pointers. I will look it up. The reason I was doing this was I had seen a C++ video ( youtube.com/watch?v=Pc8DfEyAxzg ) related to the same topic of generating fractals where he optimized his code quite a lot by parallelizing it. I was hoping to achieve the same with Java. Hopefully I will be able to do it.
– Rohit Sarkar
Sep 1 '18 at 16:16
@edharned Thanks for the pointers. I will look it up. The reason I was doing this was I had seen a C++ video ( youtube.com/watch?v=Pc8DfEyAxzg ) related to the same topic of generating fractals where he optimized his code quite a lot by parallelizing it. I was hoping to achieve the same with Java. Hopefully I will be able to do it.
– Rohit Sarkar
Sep 1 '18 at 16:16
1
1
There are certainly several reasons of why you don't see a speedup. First: There are hardly any iterations done! Most of the fractal does not reach the
maxiter
limit. Use imaginary = 0.056;
and maxiter=1000
to generate some workload. For me, the single threaded version then takes ~1500ms, and the parallel one ~800ms. Further optimizations would probably be possible. I've seen the "multicore" branch: My gut feeling is that simply tiling the image into maybe 10x10 tiles and throwing them into a Threadpool ExecutorService could be faster (but haven't investigated it in detail)– Marco13
Oct 30 '18 at 22:06
There are certainly several reasons of why you don't see a speedup. First: There are hardly any iterations done! Most of the fractal does not reach the
maxiter
limit. Use imaginary = 0.056;
and maxiter=1000
to generate some workload. For me, the single threaded version then takes ~1500ms, and the parallel one ~800ms. Further optimizations would probably be possible. I've seen the "multicore" branch: My gut feeling is that simply tiling the image into maybe 10x10 tiles and throwing them into a Threadpool ExecutorService could be faster (but haven't investigated it in detail)– Marco13
Oct 30 '18 at 22:06
|
show 2 more comments
1 Answer
1
active
oldest
votes
Here is your code rewritten to use concurrency. I found that my Lenovo Yoga misreported the number of processors by double. Also Windows 10 seems to take up an enormous amount of processing, so the results on my laptop are dubious. If you have more cores or a decent OS, it should be much better.
package au.net.qlt.canvas.test;
import javax.swing.*;
import java.awt.*;
import java.awt.image.BufferedImage;
public class TestConcurrency extends JPanel {
private BufferedImage screen;
final Fractal fractal;
private TestConcurrency(final Fractal f, Size size) {
fractal = f;
screen = new BufferedImage(size.width, size.height, BufferedImage.TYPE_INT_ARGB);
setBackground(Color.BLACK);
setPreferredSize(new Dimension(size.width,size.height));
}
public void test(boolean CONCURRENT) {
int count = CONCURRENT ? Runtime.getRuntime().availableProcessors()/2 : 1;
Scheduler scheduler = new Scheduler(fractal.size);
Thread threads = new Thread[count];
long startTime = System.currentTimeMillis();
for (int p = 0; p < count; p++) {
threads[p] = new Thread() {
public void run() {
scheduler.schedule(fractal,screen);
}
};
threads[p].start();
try {
threads[p].join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
DEBUG("test threads: %d - elasped time: %dms", count, (System.currentTimeMillis()-startTime));
}
@Override public void paint(Graphics g) {
if(g==null) return;
g.drawImage(screen, 0,0, null);
}
public static void main(Stringargs) {
JFrame frame = new JFrame("FRACTAL");
Size size = new Size(1024, 768);
Fractal fractal = new Fractal(size);
TestConcurrency test = new TestConcurrency(fractal, size);
frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
frame.setResizable(false);
frame.add(test);
frame.pack();
frame.setLocationRelativeTo(null);
frame.setVisible(true);
for(int i=1; i<=10; i++) {
DEBUG("--- Test %d ------------------", i);
test.test(false);
test.repaint();
test.test(true);
test.repaint();
}
}
public static void DEBUG(String str, Object ... args) { Also System.out.println(String.format(str, args)); }
}
class Fractal {
ComplexNumber C;
private int maxiter;
private int blowup;
private double real;
private double imaginary;
private static double xs = -2.0, xe = 2.0, ys = -1.0, ye = 1.0;
public Size size;
Fractal(Size sz){
size = sz;
real = -0.8;
imaginary = 0.156;
C = new ComplexNumber(real, imaginary);
maxiter = 400;
blowup = 4;
}
public int getPixelColor(Ref ref) {
float hue;
float saturation = 1f;
float brightness;
double X = map(ref.x,0,size.width,xs,xe);
double Y = map(ref.y,0,size.height,ys,ye);
ComplexNumber Z = new ComplexNumber(X, Y);
int i;
for (i = 0; i < maxiter; i++) {
Z.square();
Z.add(C);
if (Z.mod() > blowup) {
break;
}
}
brightness = (i < maxiter) ? 1f : 0;
hue = (i%maxiter)/(float)maxiter;
return Color.getHSBColor(hue*5,saturation,brightness).getRGB();
}
private double map(double n, double in_min, double in_max, double out_min, double out_max) {
return (n-in_min)*(out_max-out_min)/(in_max-in_min) + out_min;
}
}
class Size{
int width, height, length;
public Size(int w, int h) { width = w; height = h; length = h*w; }
}
class ComplexNumber {
private double real;
private double imaginary;
ComplexNumber(double a, double b) {
real = a;
imaginary = b;
}
void square() {
double new_real = Math.pow(real,2) - Math.pow(imaginary,2);
double new_imaginary = 2*real*imaginary;
this.real = new_real;
this.imaginary = new_imaginary;
}
double mod() {
return Math.sqrt(Math.pow(real,2) + Math.pow(imaginary,2));
}
void add(ComplexNumber c) {
this.real += c.real;
this.imaginary += c.imaginary;
}
}
class Scheduler {
private Size size;
private int x, y, index;
private final Object nextSync = 4;
public Scheduler(Size sz) { size = sz; }
/**
* Update the ref object with next available coords,
* return false if no more coords to be had (image is rendered)
*
* @param ref Ref object to be updated
* @return false if end of image reached
*/
public boolean next(Ref ref) {
synchronized (nextSync) {
// load passed in ref
ref.x = x;
ref.y = y;
ref.index = index;
if (++index > size.length) return false; // end of the image
// load local counters for next access
if (++x >= size.width) {
x = 0;
y++;
}
return true; // there are more pixels to be had
}
}
public void schedule(Fractal fractal, BufferedImage screen) {
for(Ref ref = new Ref(); next(ref);)
screen.setRGB(ref.x, ref.y, fractal.getPixelColor(ref));
}
}
class Ref {
public int x, y, index;
public Ref() {}
}
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f52128631%2fparallel-and-single-threaded-code-having-relatively-same-performance-in-java%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Here is your code rewritten to use concurrency. I found that my Lenovo Yoga misreported the number of processors by double. Also Windows 10 seems to take up an enormous amount of processing, so the results on my laptop are dubious. If you have more cores or a decent OS, it should be much better.
package au.net.qlt.canvas.test;
import javax.swing.*;
import java.awt.*;
import java.awt.image.BufferedImage;
public class TestConcurrency extends JPanel {
private BufferedImage screen;
final Fractal fractal;
private TestConcurrency(final Fractal f, Size size) {
fractal = f;
screen = new BufferedImage(size.width, size.height, BufferedImage.TYPE_INT_ARGB);
setBackground(Color.BLACK);
setPreferredSize(new Dimension(size.width,size.height));
}
public void test(boolean CONCURRENT) {
int count = CONCURRENT ? Runtime.getRuntime().availableProcessors()/2 : 1;
Scheduler scheduler = new Scheduler(fractal.size);
Thread threads = new Thread[count];
long startTime = System.currentTimeMillis();
for (int p = 0; p < count; p++) {
threads[p] = new Thread() {
public void run() {
scheduler.schedule(fractal,screen);
}
};
threads[p].start();
try {
threads[p].join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
DEBUG("test threads: %d - elasped time: %dms", count, (System.currentTimeMillis()-startTime));
}
@Override public void paint(Graphics g) {
if(g==null) return;
g.drawImage(screen, 0,0, null);
}
public static void main(Stringargs) {
JFrame frame = new JFrame("FRACTAL");
Size size = new Size(1024, 768);
Fractal fractal = new Fractal(size);
TestConcurrency test = new TestConcurrency(fractal, size);
frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
frame.setResizable(false);
frame.add(test);
frame.pack();
frame.setLocationRelativeTo(null);
frame.setVisible(true);
for(int i=1; i<=10; i++) {
DEBUG("--- Test %d ------------------", i);
test.test(false);
test.repaint();
test.test(true);
test.repaint();
}
}
public static void DEBUG(String str, Object ... args) { Also System.out.println(String.format(str, args)); }
}
class Fractal {
ComplexNumber C;
private int maxiter;
private int blowup;
private double real;
private double imaginary;
private static double xs = -2.0, xe = 2.0, ys = -1.0, ye = 1.0;
public Size size;
Fractal(Size sz){
size = sz;
real = -0.8;
imaginary = 0.156;
C = new ComplexNumber(real, imaginary);
maxiter = 400;
blowup = 4;
}
public int getPixelColor(Ref ref) {
float hue;
float saturation = 1f;
float brightness;
double X = map(ref.x,0,size.width,xs,xe);
double Y = map(ref.y,0,size.height,ys,ye);
ComplexNumber Z = new ComplexNumber(X, Y);
int i;
for (i = 0; i < maxiter; i++) {
Z.square();
Z.add(C);
if (Z.mod() > blowup) {
break;
}
}
brightness = (i < maxiter) ? 1f : 0;
hue = (i%maxiter)/(float)maxiter;
return Color.getHSBColor(hue*5,saturation,brightness).getRGB();
}
private double map(double n, double in_min, double in_max, double out_min, double out_max) {
return (n-in_min)*(out_max-out_min)/(in_max-in_min) + out_min;
}
}
class Size{
int width, height, length;
public Size(int w, int h) { width = w; height = h; length = h*w; }
}
class ComplexNumber {
private double real;
private double imaginary;
ComplexNumber(double a, double b) {
real = a;
imaginary = b;
}
void square() {
double new_real = Math.pow(real,2) - Math.pow(imaginary,2);
double new_imaginary = 2*real*imaginary;
this.real = new_real;
this.imaginary = new_imaginary;
}
double mod() {
return Math.sqrt(Math.pow(real,2) + Math.pow(imaginary,2));
}
void add(ComplexNumber c) {
this.real += c.real;
this.imaginary += c.imaginary;
}
}
class Scheduler {
private Size size;
private int x, y, index;
private final Object nextSync = 4;
public Scheduler(Size sz) { size = sz; }
/**
* Update the ref object with next available coords,
* return false if no more coords to be had (image is rendered)
*
* @param ref Ref object to be updated
* @return false if end of image reached
*/
public boolean next(Ref ref) {
synchronized (nextSync) {
// load passed in ref
ref.x = x;
ref.y = y;
ref.index = index;
if (++index > size.length) return false; // end of the image
// load local counters for next access
if (++x >= size.width) {
x = 0;
y++;
}
return true; // there are more pixels to be had
}
}
public void schedule(Fractal fractal, BufferedImage screen) {
for(Ref ref = new Ref(); next(ref);)
screen.setRGB(ref.x, ref.y, fractal.getPixelColor(ref));
}
}
class Ref {
public int x, y, index;
public Ref() {}
}
add a comment |
Here is your code rewritten to use concurrency. I found that my Lenovo Yoga misreported the number of processors by double. Also Windows 10 seems to take up an enormous amount of processing, so the results on my laptop are dubious. If you have more cores or a decent OS, it should be much better.
package au.net.qlt.canvas.test;
import javax.swing.*;
import java.awt.*;
import java.awt.image.BufferedImage;
public class TestConcurrency extends JPanel {
private BufferedImage screen;
final Fractal fractal;
private TestConcurrency(final Fractal f, Size size) {
fractal = f;
screen = new BufferedImage(size.width, size.height, BufferedImage.TYPE_INT_ARGB);
setBackground(Color.BLACK);
setPreferredSize(new Dimension(size.width,size.height));
}
public void test(boolean CONCURRENT) {
int count = CONCURRENT ? Runtime.getRuntime().availableProcessors()/2 : 1;
Scheduler scheduler = new Scheduler(fractal.size);
Thread threads = new Thread[count];
long startTime = System.currentTimeMillis();
for (int p = 0; p < count; p++) {
threads[p] = new Thread() {
public void run() {
scheduler.schedule(fractal,screen);
}
};
threads[p].start();
try {
threads[p].join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
DEBUG("test threads: %d - elasped time: %dms", count, (System.currentTimeMillis()-startTime));
}
@Override public void paint(Graphics g) {
if(g==null) return;
g.drawImage(screen, 0,0, null);
}
public static void main(Stringargs) {
JFrame frame = new JFrame("FRACTAL");
Size size = new Size(1024, 768);
Fractal fractal = new Fractal(size);
TestConcurrency test = new TestConcurrency(fractal, size);
frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
frame.setResizable(false);
frame.add(test);
frame.pack();
frame.setLocationRelativeTo(null);
frame.setVisible(true);
for(int i=1; i<=10; i++) {
DEBUG("--- Test %d ------------------", i);
test.test(false);
test.repaint();
test.test(true);
test.repaint();
}
}
public static void DEBUG(String str, Object ... args) { Also System.out.println(String.format(str, args)); }
}
class Fractal {
ComplexNumber C;
private int maxiter;
private int blowup;
private double real;
private double imaginary;
private static double xs = -2.0, xe = 2.0, ys = -1.0, ye = 1.0;
public Size size;
Fractal(Size sz){
size = sz;
real = -0.8;
imaginary = 0.156;
C = new ComplexNumber(real, imaginary);
maxiter = 400;
blowup = 4;
}
public int getPixelColor(Ref ref) {
float hue;
float saturation = 1f;
float brightness;
double X = map(ref.x,0,size.width,xs,xe);
double Y = map(ref.y,0,size.height,ys,ye);
ComplexNumber Z = new ComplexNumber(X, Y);
int i;
for (i = 0; i < maxiter; i++) {
Z.square();
Z.add(C);
if (Z.mod() > blowup) {
break;
}
}
brightness = (i < maxiter) ? 1f : 0;
hue = (i%maxiter)/(float)maxiter;
return Color.getHSBColor(hue*5,saturation,brightness).getRGB();
}
private double map(double n, double in_min, double in_max, double out_min, double out_max) {
return (n-in_min)*(out_max-out_min)/(in_max-in_min) + out_min;
}
}
class Size{
int width, height, length;
public Size(int w, int h) { width = w; height = h; length = h*w; }
}
class ComplexNumber {
private double real;
private double imaginary;
ComplexNumber(double a, double b) {
real = a;
imaginary = b;
}
void square() {
double new_real = Math.pow(real,2) - Math.pow(imaginary,2);
double new_imaginary = 2*real*imaginary;
this.real = new_real;
this.imaginary = new_imaginary;
}
double mod() {
return Math.sqrt(Math.pow(real,2) + Math.pow(imaginary,2));
}
void add(ComplexNumber c) {
this.real += c.real;
this.imaginary += c.imaginary;
}
}
class Scheduler {
private Size size;
private int x, y, index;
private final Object nextSync = 4;
public Scheduler(Size sz) { size = sz; }
/**
* Update the ref object with next available coords,
* return false if no more coords to be had (image is rendered)
*
* @param ref Ref object to be updated
* @return false if end of image reached
*/
public boolean next(Ref ref) {
synchronized (nextSync) {
// load passed in ref
ref.x = x;
ref.y = y;
ref.index = index;
if (++index > size.length) return false; // end of the image
// load local counters for next access
if (++x >= size.width) {
x = 0;
y++;
}
return true; // there are more pixels to be had
}
}
public void schedule(Fractal fractal, BufferedImage screen) {
for(Ref ref = new Ref(); next(ref);)
screen.setRGB(ref.x, ref.y, fractal.getPixelColor(ref));
}
}
class Ref {
public int x, y, index;
public Ref() {}
}
add a comment |
Here is your code rewritten to use concurrency. I found that my Lenovo Yoga misreported the number of processors by double. Also Windows 10 seems to take up an enormous amount of processing, so the results on my laptop are dubious. If you have more cores or a decent OS, it should be much better.
package au.net.qlt.canvas.test;
import javax.swing.*;
import java.awt.*;
import java.awt.image.BufferedImage;
public class TestConcurrency extends JPanel {
private BufferedImage screen;
final Fractal fractal;
private TestConcurrency(final Fractal f, Size size) {
fractal = f;
screen = new BufferedImage(size.width, size.height, BufferedImage.TYPE_INT_ARGB);
setBackground(Color.BLACK);
setPreferredSize(new Dimension(size.width,size.height));
}
public void test(boolean CONCURRENT) {
int count = CONCURRENT ? Runtime.getRuntime().availableProcessors()/2 : 1;
Scheduler scheduler = new Scheduler(fractal.size);
Thread threads = new Thread[count];
long startTime = System.currentTimeMillis();
for (int p = 0; p < count; p++) {
threads[p] = new Thread() {
public void run() {
scheduler.schedule(fractal,screen);
}
};
threads[p].start();
try {
threads[p].join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
DEBUG("test threads: %d - elasped time: %dms", count, (System.currentTimeMillis()-startTime));
}
@Override public void paint(Graphics g) {
if(g==null) return;
g.drawImage(screen, 0,0, null);
}
public static void main(Stringargs) {
JFrame frame = new JFrame("FRACTAL");
Size size = new Size(1024, 768);
Fractal fractal = new Fractal(size);
TestConcurrency test = new TestConcurrency(fractal, size);
frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
frame.setResizable(false);
frame.add(test);
frame.pack();
frame.setLocationRelativeTo(null);
frame.setVisible(true);
for(int i=1; i<=10; i++) {
DEBUG("--- Test %d ------------------", i);
test.test(false);
test.repaint();
test.test(true);
test.repaint();
}
}
public static void DEBUG(String str, Object ... args) { Also System.out.println(String.format(str, args)); }
}
class Fractal {
ComplexNumber C;
private int maxiter;
private int blowup;
private double real;
private double imaginary;
private static double xs = -2.0, xe = 2.0, ys = -1.0, ye = 1.0;
public Size size;
Fractal(Size sz){
size = sz;
real = -0.8;
imaginary = 0.156;
C = new ComplexNumber(real, imaginary);
maxiter = 400;
blowup = 4;
}
public int getPixelColor(Ref ref) {
float hue;
float saturation = 1f;
float brightness;
double X = map(ref.x,0,size.width,xs,xe);
double Y = map(ref.y,0,size.height,ys,ye);
ComplexNumber Z = new ComplexNumber(X, Y);
int i;
for (i = 0; i < maxiter; i++) {
Z.square();
Z.add(C);
if (Z.mod() > blowup) {
break;
}
}
brightness = (i < maxiter) ? 1f : 0;
hue = (i%maxiter)/(float)maxiter;
return Color.getHSBColor(hue*5,saturation,brightness).getRGB();
}
private double map(double n, double in_min, double in_max, double out_min, double out_max) {
return (n-in_min)*(out_max-out_min)/(in_max-in_min) + out_min;
}
}
class Size{
int width, height, length;
public Size(int w, int h) { width = w; height = h; length = h*w; }
}
class ComplexNumber {
private double real;
private double imaginary;
ComplexNumber(double a, double b) {
real = a;
imaginary = b;
}
void square() {
double new_real = Math.pow(real,2) - Math.pow(imaginary,2);
double new_imaginary = 2*real*imaginary;
this.real = new_real;
this.imaginary = new_imaginary;
}
double mod() {
return Math.sqrt(Math.pow(real,2) + Math.pow(imaginary,2));
}
void add(ComplexNumber c) {
this.real += c.real;
this.imaginary += c.imaginary;
}
}
class Scheduler {
private Size size;
private int x, y, index;
private final Object nextSync = 4;
public Scheduler(Size sz) { size = sz; }
/**
* Update the ref object with next available coords,
* return false if no more coords to be had (image is rendered)
*
* @param ref Ref object to be updated
* @return false if end of image reached
*/
public boolean next(Ref ref) {
synchronized (nextSync) {
// load passed in ref
ref.x = x;
ref.y = y;
ref.index = index;
if (++index > size.length) return false; // end of the image
// load local counters for next access
if (++x >= size.width) {
x = 0;
y++;
}
return true; // there are more pixels to be had
}
}
public void schedule(Fractal fractal, BufferedImage screen) {
for(Ref ref = new Ref(); next(ref);)
screen.setRGB(ref.x, ref.y, fractal.getPixelColor(ref));
}
}
class Ref {
public int x, y, index;
public Ref() {}
}
Here is your code rewritten to use concurrency. I found that my Lenovo Yoga misreported the number of processors by double. Also Windows 10 seems to take up an enormous amount of processing, so the results on my laptop are dubious. If you have more cores or a decent OS, it should be much better.
package au.net.qlt.canvas.test;
import javax.swing.*;
import java.awt.*;
import java.awt.image.BufferedImage;
public class TestConcurrency extends JPanel {
private BufferedImage screen;
final Fractal fractal;
private TestConcurrency(final Fractal f, Size size) {
fractal = f;
screen = new BufferedImage(size.width, size.height, BufferedImage.TYPE_INT_ARGB);
setBackground(Color.BLACK);
setPreferredSize(new Dimension(size.width,size.height));
}
public void test(boolean CONCURRENT) {
int count = CONCURRENT ? Runtime.getRuntime().availableProcessors()/2 : 1;
Scheduler scheduler = new Scheduler(fractal.size);
Thread threads = new Thread[count];
long startTime = System.currentTimeMillis();
for (int p = 0; p < count; p++) {
threads[p] = new Thread() {
public void run() {
scheduler.schedule(fractal,screen);
}
};
threads[p].start();
try {
threads[p].join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
DEBUG("test threads: %d - elasped time: %dms", count, (System.currentTimeMillis()-startTime));
}
@Override public void paint(Graphics g) {
if(g==null) return;
g.drawImage(screen, 0,0, null);
}
public static void main(Stringargs) {
JFrame frame = new JFrame("FRACTAL");
Size size = new Size(1024, 768);
Fractal fractal = new Fractal(size);
TestConcurrency test = new TestConcurrency(fractal, size);
frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
frame.setResizable(false);
frame.add(test);
frame.pack();
frame.setLocationRelativeTo(null);
frame.setVisible(true);
for(int i=1; i<=10; i++) {
DEBUG("--- Test %d ------------------", i);
test.test(false);
test.repaint();
test.test(true);
test.repaint();
}
}
public static void DEBUG(String str, Object ... args) { Also System.out.println(String.format(str, args)); }
}
class Fractal {
ComplexNumber C;
private int maxiter;
private int blowup;
private double real;
private double imaginary;
private static double xs = -2.0, xe = 2.0, ys = -1.0, ye = 1.0;
public Size size;
Fractal(Size sz){
size = sz;
real = -0.8;
imaginary = 0.156;
C = new ComplexNumber(real, imaginary);
maxiter = 400;
blowup = 4;
}
public int getPixelColor(Ref ref) {
float hue;
float saturation = 1f;
float brightness;
double X = map(ref.x,0,size.width,xs,xe);
double Y = map(ref.y,0,size.height,ys,ye);
ComplexNumber Z = new ComplexNumber(X, Y);
int i;
for (i = 0; i < maxiter; i++) {
Z.square();
Z.add(C);
if (Z.mod() > blowup) {
break;
}
}
brightness = (i < maxiter) ? 1f : 0;
hue = (i%maxiter)/(float)maxiter;
return Color.getHSBColor(hue*5,saturation,brightness).getRGB();
}
private double map(double n, double in_min, double in_max, double out_min, double out_max) {
return (n-in_min)*(out_max-out_min)/(in_max-in_min) + out_min;
}
}
class Size{
int width, height, length;
public Size(int w, int h) { width = w; height = h; length = h*w; }
}
class ComplexNumber {
private double real;
private double imaginary;
ComplexNumber(double a, double b) {
real = a;
imaginary = b;
}
void square() {
double new_real = Math.pow(real,2) - Math.pow(imaginary,2);
double new_imaginary = 2*real*imaginary;
this.real = new_real;
this.imaginary = new_imaginary;
}
double mod() {
return Math.sqrt(Math.pow(real,2) + Math.pow(imaginary,2));
}
void add(ComplexNumber c) {
this.real += c.real;
this.imaginary += c.imaginary;
}
}
class Scheduler {
private Size size;
private int x, y, index;
private final Object nextSync = 4;
public Scheduler(Size sz) { size = sz; }
/**
* Update the ref object with next available coords,
* return false if no more coords to be had (image is rendered)
*
* @param ref Ref object to be updated
* @return false if end of image reached
*/
public boolean next(Ref ref) {
synchronized (nextSync) {
// load passed in ref
ref.x = x;
ref.y = y;
ref.index = index;
if (++index > size.length) return false; // end of the image
// load local counters for next access
if (++x >= size.width) {
x = 0;
y++;
}
return true; // there are more pixels to be had
}
}
public void schedule(Fractal fractal, BufferedImage screen) {
for(Ref ref = new Ref(); next(ref);)
screen.setRGB(ref.x, ref.y, fractal.getPixelColor(ref));
}
}
class Ref {
public int x, y, index;
public Ref() {}
}
edited Nov 22 '18 at 1:29
answered Oct 30 '18 at 21:26
daviburndaviburn
264
264
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f52128631%2fparallel-and-single-threaded-code-having-relatively-same-performance-in-java%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
This link, gee.cs.oswego.edu/dl/html/StreamParallelGuidance.html is about parallel streams but the focus is on performance with FG. There have been a lot of questions on why parallel is sometimes slower and all the reasons are too numerous to mention here. Data Parallelism, en.wikipedia.org/wiki/Data_parallelism requires many processors to do properly -- 4 is a small number.
– edharned
Sep 1 '18 at 14:01
@edharned What do you mean when you say "FG"? I scanned over the document and it is a lot to take in. Although does this mean for this particular problem ( Julia Set ) it is worthless to write parallelized code?
– Rohit Sarkar
Sep 1 '18 at 14:11
1
FG is ForkJoin. Parallel Streams are based on FG. Search for "java parallel is sometimes slower than synchronous" turns up many questions/answers here on SO. Only you can answer the question of whether to parallelize or not. As I said above, Data Parallel requires spreading the work around many, many processors. Data Parallel used to be limited to Massively Parallel Processors (scientific/academic) before the 2, 4, 8, etc. core machines became popular. Run your problem on a 256 core machine and see what happens.
– edharned
Sep 1 '18 at 15:56
@edharned Thanks for the pointers. I will look it up. The reason I was doing this was I had seen a C++ video ( youtube.com/watch?v=Pc8DfEyAxzg ) related to the same topic of generating fractals where he optimized his code quite a lot by parallelizing it. I was hoping to achieve the same with Java. Hopefully I will be able to do it.
– Rohit Sarkar
Sep 1 '18 at 16:16
1
There are certainly several reasons of why you don't see a speedup. First: There are hardly any iterations done! Most of the fractal does not reach the
maxiter
limit. Useimaginary = 0.056;
andmaxiter=1000
to generate some workload. For me, the single threaded version then takes ~1500ms, and the parallel one ~800ms. Further optimizations would probably be possible. I've seen the "multicore" branch: My gut feeling is that simply tiling the image into maybe 10x10 tiles and throwing them into a Threadpool ExecutorService could be faster (but haven't investigated it in detail)– Marco13
Oct 30 '18 at 22:06