Benchmark in a multi-threaded environment

I was learning about multithreading and noticed that Object.hashCode slows down in a multi-threaded environment: computing the default hash code takes roughly twice as long when running 4 threads as when running 1 thread, for the same number of objects per thread.

But as I understand it, it should take a similar amount of time when done in parallel.

You can change the number of threads. Each thread has the same amount of work to do, so you would hope that running 4 threads on my machine, which is quad-core, would take about the same time as running a single thread.

I'm seeing ~2.3 seconds for 4 threads but 0.9 seconds for 1 thread.

Is there a gap in my understanding? Please help me understand this behaviour.

import java.util.Arrays; 
import java.util.List; 
import java.util.concurrent.Callable; 
import java.util.concurrent.ExecutorService; 
import java.util.concurrent.Executors; 
import java.util.concurrent.Future; 
import java.util.concurrent.ThreadFactory; 

public class ObjectHashCodePerformance {

    private static final int THREAD_COUNT = 4;
    private static final int ITERATIONS = 20000000;

    public static void main(final String[] args) throws Exception {
        long start = System.currentTimeMillis();
        new ObjectHashCodePerformance().run();
        System.err.println(System.currentTimeMillis() - start);
    }

    // Fixed-size pool of daemon threads, so the JVM can exit without an explicit shutdown.
    private final ExecutorService _service = Executors.newFixedThreadPool(THREAD_COUNT,
            new ThreadFactory() {
                private final ThreadFactory _delegate = Executors.defaultThreadFactory();

                @Override
                public Thread newThread(final Runnable r) {
                    Thread thread = _delegate.newThread(r);
                    thread.setDaemon(true);
                    return thread;
                }
            });

    private void run() throws Exception {
        // Each task allocates ITERATIONS objects and asks each one for its hash code.
        Callable<Void> work = new Callable<Void>() {
            @Override
            public Void call() throws Exception {
                for (int i = 0; i < ITERATIONS; i++) {
                    Object object = new Object();
                    object.hashCode();
                }
                return null;
            }
        };
        @SuppressWarnings("unchecked")
        Callable<Void>[] allWork = new Callable[THREAD_COUNT];
        Arrays.fill(allWork, work);
        // Submit one copy of the task per thread and wait for all of them to finish.
        List<Future<Void>> futures = _service.invokeAll(Arrays.asList(allWork));
        for (Future<Void> future : futures) {
            future.get();
        }
    }

}

Output with a thread count of 4:

~2.3 seconds

Output with a thread count of 1:

~0.9 seconds


2015-12-16 Show Stopper

Please share what you changed between the 1-thread and 4-thread runs – Jan

Measuring elapsed time this way does not necessarily tell you much here. See http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java – Marco13

You are probably not measuring the right thing: GC, creation of the executor and its threads, thread coordination, object instantiation, memory allocation, and so on. In any case, the benchmark is rather pointless, since you cannot change anything about Object's hashCode() implementation. –
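To make the points in these comments concrete, here is a minimal sketch of a hand-rolled timing harness that keeps pool creation and thread start-up outside the timed region and runs one warm-up pass first. The class name `TimedRunSketch` and the helper `timeTasks` are hypothetical names chosen for illustration; this is not a substitute for a proper benchmark framework, only a way to see which costs the original `main` was including.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class TimedRunSketch {

    // Hypothetical helper: runs `threads` copies of `task` on a pool that is
    // created before the clock starts, and returns the elapsed nanoseconds.
    static long timeTasks(int threads, Callable<Integer> task) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                tasks.add(task);
            }
            // Warm-up pass: starts the worker threads and gives the JIT a chance
            // to compile the task before anything is timed.
            pool.invokeAll(tasks);

            long start = System.nanoTime();
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                future.get(); // propagates any exception thrown by the task
            }
            return System.nanoTime() - start;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        final Object shared = new Object();
        Callable<Integer> task = () -> {
            int sink = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sink ^= shared.hashCode(); // use the result so the call is not obviously dead code
            }
            return sink;
        };
        for (int threads : new int[] {1, 4}) {
            long nanos = timeTasks(threads, task);
            System.out.println(threads + " thread(s): "
                    + TimeUnit.NANOSECONDS.toMillis(nanos) + " ms");
        }
    }
}
```

Even with these precautions, a single timed pass is noisy, which is why the linked question recommends a dedicated harness.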

I created a simple JMH benchmark to test the various cases:

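import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

@Fork(1)
@State(Scope.Benchmark)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Measurement(iterations = 10)
@Warmup(iterations = 10)
@BenchmarkMode(Mode.AverageTime)
public class HashCodeBenchmark {
    // One object shared by all benchmark threads; nothing is allocated in the measured code.
    private final Object object = new Object();

    @Benchmark
    @Threads(1)
    public void singleThread(Blackhole blackhole) {
        blackhole.consume(object.hashCode());
    }

    @Benchmark
    @Threads(2)
    public void twoThreads(Blackhole blackhole) {
        blackhole.consume(object.hashCode());
    }

    @Benchmark
    @Threads(4)
    public void fourThreads(Blackhole blackhole) {
        blackhole.consume(object.hashCode());
    }

    @Benchmark
    @Threads(8)
    public void eightThreads(Blackhole blackhole) {
        blackhole.consume(object.hashCode());
    }
}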

And the results are as follows:

Benchmark                       Mode  Cnt  Score   Error  Units
HashCodeBenchmark.eightThreads  avgt   10  5.710 ± 0.087  ns/op
HashCodeBenchmark.fourThreads   avgt   10  3.603 ± 0.169  ns/op
HashCodeBenchmark.singleThread  avgt   10  3.063 ± 0.011  ns/op
HashCodeBenchmark.twoThreads    avgt   10  3.067 ± 0.034  ns/op

So we can see that as long as there are no more threads than cores, the time per hash code stays about the same.
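If you want to keep the thread count within the number of cores when sizing a pool, a small sketch like the following may help. The class `CoreCountSketch` and the helper `cappedThreads` are hypothetical names; note also that `Runtime.availableProcessors()` reports logical processors, which on machines with hyper-threading can be higher than the number of physical cores.

```java
public class CoreCountSketch {

    // Hypothetical helper: caps a requested thread count at the number of
    // available processors, but never goes below one thread.
    static int cappedThreads(int requested, int available) {
        return Math.max(1, Math.min(requested, available));
    }

    public static void main(String[] args) {
        int available = Runtime.getRuntime().availableProcessors();
        for (int requested : new int[] {1, 2, 4, 8}) {
            System.out.println("requested " + requested
                    + " -> using " + cappedThreads(requested, available));
        }
    }
}
```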

PS: as @Tom Cools commented, your test is measuring allocation speed, not hashCode() speed.


2015-12-16 14:39:54

Thanks for the analysis ... :) –

Could you please tell me which tool you used for the benchmark? –

It's called JMH: http://openjdk.java.net/projects/code-tools/jmh/ –

See Palamino's comment:

You are not measuring hashCode(), you are measuring the instantiation of 20 million objects with a single thread and 80 million objects with 4 threads running. Move the new Object() logic out of the for loop in your Callable, then you will be measuring hashCode() - Palamino
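The change this comment suggests could look roughly like the sketch below, which puts the two variants side by side. The class `HoistedAllocationSketch` and its two factory methods are hypothetical names. One caveat worth stating: as far as I know, HotSpot computes an object's identity hash on the first hashCode() call and stores it in the object header, so calling it repeatedly on one object mostly measures reading a cached value, while calling it on fresh objects measures allocation plus first-time hash generation.

```java
import java.util.concurrent.Callable;

public class HoistedAllocationSketch {

    // Variant from the question: a new object is allocated on every iteration,
    // so allocation cost is part of whatever gets timed.
    static Callable<Integer> allocatingWork(int iterations) {
        return () -> {
            int sink = 0;
            for (int i = 0; i < iterations; i++) {
                sink += new Object().hashCode();
            }
            return sink;
        };
    }

    // Variant suggested in the comment: the object is created once, outside the
    // loop, so the loop body contains only the hashCode() call.
    static Callable<Integer> hoistedWork(Object object, int iterations) {
        return () -> {
            int sink = 0;
            for (int i = 0; i < iterations; i++) {
                sink += object.hashCode();
            }
            return sink;
        };
    }

    public static void main(String[] args) throws Exception {
        int iterations = 20_000_000;

        long start = System.nanoTime();
        allocatingWork(iterations).call();
        System.out.println("allocating: " + (System.nanoTime() - start) / 1_000_000 + " ms");

        start = System.nanoTime();
        hoistedWork(new Object(), iterations).call();
        System.out.println("hoisted:    " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }
}
```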


2015-12-16 13:53:36

He said that you can change the number of threads to observe the problem he described – Marco13

I moved it, same result .. :( –

Two problems I see with the code:

The size of the allWork[] array should be equal to ITERATIONS.
And while iterating in the call() method, make sure that each thread gets its share of the load: ITERATIONS/THREAD_COUNT.
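The second point, giving each thread its share of a fixed total, can be sketched as follows. This covers only that point and assumes one task per thread, which is my reading of "share of the load" rather than something the answer states. The names `SplitWorkSketch`, `split` and `runSplit` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SplitWorkSketch {

    // Hypothetical helper: splits `total` iterations into `parts` chunks whose
    // sizes differ by at most one and always sum to `total`.
    static int[] split(int total, int parts) {
        int[] chunks = new int[parts];
        int base = total / parts;
        int remainder = total % parts;
        for (int i = 0; i < parts; i++) {
            chunks[i] = base + (i < remainder ? 1 : 0);
        }
        return chunks;
    }

    // Runs one task per thread, each handling its own chunk, and returns the
    // number of hashCode() calls that were actually made across all tasks.
    static long runSplit(int total, int threads) throws Exception {
        final Object shared = new Object();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (final int chunk : split(total, threads)) {
                tasks.add(() -> {
                    int done = 0;
                    for (int i = 0; i < chunk; i++) {
                        shared.hashCode();
                        done++;
                    }
                    return done;
                });
            }
            long completed = 0;
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                completed += future.get();
            }
            return completed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runSplit(20_000_000, 4)); // prints 20000000
    }
}
```

With this split the total amount of work is the same for any thread count, so the 1-thread and 4-thread timings compare the same workload instead of 20 million versus 80 million iterations.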

Below is a modified version you can try:

import java.util.Arrays; 
import java.util.List; 
import java.util.concurrent.Callable; 
import java.util.concurrent.CountDownLatch; 
import java.util.concurrent.ExecutorService; 
import java.util.concurrent.Executors; 
import java.util.concurrent.Future; 
import java.util.concurrent.ThreadFactory; 

public class ObjectHashCodePerformance {

    private static final int THREAD_COUNT = 1;
    private static final int ITERATIONS = 20000;
    // Shared by all tasks, so nothing is allocated inside the loop.
    private final Object object = new Object();

    public static void main(final String[] args) throws Exception {
        long start = System.currentTimeMillis();
        new ObjectHashCodePerformance().run();
        System.err.println(System.currentTimeMillis() - start);
    }

    // Fixed-size pool of daemon threads, so the JVM can exit without an explicit shutdown.
    private final ExecutorService _service = Executors.newFixedThreadPool(THREAD_COUNT,
            new ThreadFactory() {
                private final ThreadFactory _delegate = Executors.defaultThreadFactory();

                @Override
                public Thread newThread(final Runnable r) {
                    Thread thread = _delegate.newThread(r);
                    thread.setDaemon(true);
                    return thread;
                }
            });

    private void run() throws Exception {
        // Each task performs its share of the iterations on the shared object.
        Callable<Void> work = new Callable<Void>() {
            @Override
            public Void call() throws Exception {
                for (int i = 0; i < ITERATIONS / THREAD_COUNT; i++) {
                    object.hashCode();
                }
                return null;
            }
        };
        // Note: this submits ITERATIONS tasks, not one task per thread.
        @SuppressWarnings("unchecked")
        Callable<Void>[] allWork = new Callable[ITERATIONS];
        Arrays.fill(allWork, work);
        List<Future<Void>> futures = _service.invokeAll(Arrays.asList(allWork));
        System.out.println("Futures size : " + futures.size());
        for (Future<Void> future : futures) {
            future.get();
        }
    }

}


2015-12-16 14:41:18

In the 'run()/call()' method you are still allocating objects, so you are measuring hash code plus allocation speed. Your answer is flawed. –

@SvetlinZarev Point taken, I have updated the code. –
