Java large files disk IO performance

I have two files (2 GB each) on my hard disk and want to compare them with each other:

Copying the original files with Windows Explorer takes approx. 2-4 minutes (that is reading and writing, on the same physical and logical disk).
Reading both files with java.io.FileInputStream and comparing the byte arrays byte by byte takes more than 20 minutes.
The java.io.BufferedInputStream buffer is 64 KB; the files are read in chunks and then compared.

The comparison is done in a tight loop like

int count = Math.min(numRead[0], numRead[1]); // numRead[] holds the bytes read from each file
for (int k = 0; k < count; k++) 
{ 
    if (buffer[1][k] != buffer[0][k]) 
    { 
     return buffer[0][k] - buffer[1][k]; 
    } 
}

What can I do to speed this up? Is NIO supposed to be faster than plain streams? Is Java unable to use DMA/SATA technologies, making slow OS API calls instead?

EDIT:
Thanks for the answers. I did some experiments based on them. As Andreas showed,

streams or NIO approaches do not differ much.
More important is the correct buffer size.

This is confirmed by my own experiments. As the files are read in big chunks, even additional buffers (BufferedInputStream) do not help. Optimizing the comparison is possible, and I got the best results with 32-fold unrolling, but the time spent comparing is small compared to the disk reads, so the speedup is small. Looks like there is nothing I can do ;-(

2009-06-08 Peter Kofler

Note: DMA/SATA technologies are handled by the operating system by default for all file I/O operations (well, on modern operating systems).

I tried three different methods of comparing two identical 3.8 GB files with buffer sizes between 8 KB and 1 MB. The first method used just two buffered input streams.

The second approach uses a thread pool that reads in two different threads and compares in a third one. This got slightly higher throughput at the expense of high CPU utilisation. Managing the thread pool adds a lot of overhead with such short-running tasks.

The third approach uses NIO, as posted by laginimaineb.

As you can see, the general approach does not make much difference. More important is the correct buffer size.

What is strange is that I read 1 byte less using threads. I could not spot the error, though.
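
A likely cause of the missing byte: in isEqualsParallel in the code below, the line totalbytes += lastSize also runs when read() has just returned -1 at end of file, so the sentinel itself is subtracted from the total. The sketch below reproduces that effect in isolation; the class and method names are made up for illustration. Guarding the addition with a check such as lastSize > 0 would remove the off-by-one.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class EofCountDemo {

    // Sums every value returned by read(), including the -1 end-of-file marker.
    static long naiveTotal(InputStream in, byte[] buf) throws IOException {
        long total = 0;
        int n;
        do {
            n = in.read(buf);
            total += n; // on the final call this adds -1
        } while (n != -1);
        return total;
    }

    // Sums only the calls that actually delivered data.
    static long correctTotal(InputStream in, byte[] buf) throws IOException {
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1000];
        System.out.println(naiveTotal(new ByteArrayInputStream(data), new byte[64]));   // 999
        System.out.println(correctTotal(new ByteArrayInputStream(data), new byte[64])); // 1000
    }
}
```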

comparing just with two streams 
I was equal, even after 3684070360 bytes and reading for 704813 ms (4,98MB/sec * 2) with a buffer size of 8 kB 
I was equal, even after 3684070360 bytes and reading for 578563 ms (6,07MB/sec * 2) with a buffer size of 16 kB 
I was equal, even after 3684070360 bytes and reading for 515422 ms (6,82MB/sec * 2) with a buffer size of 32 kB 
I was equal, even after 3684070360 bytes and reading for 534532 ms (6,57MB/sec * 2) with a buffer size of 64 kB 
I was equal, even after 3684070360 bytes and reading for 422953 ms (8,31MB/sec * 2) with a buffer size of 128 kB 
I was equal, even after 3684070360 bytes and reading for 793359 ms (4,43MB/sec * 2) with a buffer size of 256 kB 
I was equal, even after 3684070360 bytes and reading for 746344 ms (4,71MB/sec * 2) with a buffer size of 512 kB 
I was equal, even after 3684070360 bytes and reading for 669969 ms (5,24MB/sec * 2) with a buffer size of 1024 kB 
comparing with threads 
I was equal, even after 3684070359 bytes and reading for 602391 ms (5,83MB/sec * 2) with a buffer size of 8 kB 
I was equal, even after 3684070359 bytes and reading for 523156 ms (6,72MB/sec * 2) with a buffer size of 16 kB 
I was equal, even after 3684070359 bytes and reading for 527547 ms (6,66MB/sec * 2) with a buffer size of 32 kB 
I was equal, even after 3684070359 bytes and reading for 276750 ms (12,69MB/sec * 2) with a buffer size of 64 kB 
I was equal, even after 3684070359 bytes and reading for 493172 ms (7,12MB/sec * 2) with a buffer size of 128 kB 
I was equal, even after 3684070359 bytes and reading for 696781 ms (5,04MB/sec * 2) with a buffer size of 256 kB 
I was equal, even after 3684070359 bytes and reading for 727953 ms (4,83MB/sec * 2) with a buffer size of 512 kB 
I was equal, even after 3684070359 bytes and reading for 741000 ms (4,74MB/sec * 2) with a buffer size of 1024 kB 
comparing with nio 
I was equal, even after 3684070360 bytes and reading for 661313 ms (5,31MB/sec * 2) with a buffer size of 8 kB 
I was equal, even after 3684070360 bytes and reading for 656156 ms (5,35MB/sec * 2) with a buffer size of 16 kB 
I was equal, even after 3684070360 bytes and reading for 491781 ms (7,14MB/sec * 2) with a buffer size of 32 kB 
I was equal, even after 3684070360 bytes and reading for 317360 ms (11,07MB/sec * 2) with a buffer size of 64 kB 
I was equal, even after 3684070360 bytes and reading for 643078 ms (5,46MB/sec * 2) with a buffer size of 128 kB 
I was equal, even after 3684070360 bytes and reading for 865016 ms (4,06MB/sec * 2) with a buffer size of 256 kB 
I was equal, even after 3684070360 bytes and reading for 716796 ms (4,90MB/sec * 2) with a buffer size of 512 kB 
I was equal, even after 3684070360 bytes and reading for 652016 ms (5,39MB/sec * 2) with a buffer size of 1024 kB

The code used:

import org.junit.Assert; 
import org.junit.Before; 
import org.junit.Test; 

import java.io.BufferedInputStream; 
import java.io.File; 
import java.io.FileInputStream; 
import java.io.IOException; 
import java.nio.ByteBuffer; 
import java.nio.channels.FileChannel; 
import java.text.DecimalFormat; 
import java.text.NumberFormat; 
import java.util.Arrays; 
import java.util.concurrent.*; 

public class FileCompare { 

    private static final int MIN_BUFFER_SIZE = 1024 * 8; 
    private static final int MAX_BUFFER_SIZE = 1024 * 1024; 
    private String fileName1; 
    private String fileName2; 
    private long start; 
    private long totalbytes; 

    @Before 
    public void createInputStream() { 
     fileName1 = "bigFile.1"; 
     fileName2 = "bigFile.2"; 
    } 

    @Test 
    public void compareTwoFiles() throws IOException { 
     System.out.println("comparing just with two streams"); 
     int currentBufferSize = MIN_BUFFER_SIZE; 
     while (currentBufferSize <= MAX_BUFFER_SIZE) { 
      compareWithBufferSize(currentBufferSize); 
      currentBufferSize *= 2; 
     } 
    } 

    @Test 
    public void compareTwoFilesFutures() 
      throws IOException, ExecutionException, InterruptedException { 
     System.out.println("comparing with threads"); 
     int myBufferSize = MIN_BUFFER_SIZE; 
     while (myBufferSize <= MAX_BUFFER_SIZE) { 
      start = System.currentTimeMillis(); 
      totalbytes = 0; 
      compareWithBufferSizeFutures(myBufferSize); 
      myBufferSize *= 2; 
     } 
    } 

    @Test 
    public void compareTwoFilesNio() throws IOException { 
     System.out.println("comparing with nio"); 
     int myBufferSize = MIN_BUFFER_SIZE; 
     while (myBufferSize <= MAX_BUFFER_SIZE) { 
      start = System.currentTimeMillis(); 
      totalbytes = 0; 
      boolean wasEqual = isEqualsNio(myBufferSize); 

      if (wasEqual) { 
       printAfterEquals(myBufferSize); 
      } else { 
       Assert.fail("files were not equal"); 
      } 

      myBufferSize *= 2; 
     } 

    } 

    private void compareWithBufferSize(int myBufferSize) throws IOException { 
     final BufferedInputStream inputStream1 = 
       new BufferedInputStream(
         new FileInputStream(new File(fileName1)), 
         myBufferSize); 
     byte[] buff1 = new byte[myBufferSize]; 
     final BufferedInputStream inputStream2 = 
       new BufferedInputStream(
         new FileInputStream(new File(fileName2)), 
         myBufferSize); 
     byte[] buff2 = new byte[myBufferSize]; 
     int read1; 

     start = System.currentTimeMillis(); 
     totalbytes = 0; 
     while ((read1 = inputStream1.read(buff1)) != -1) { 
      totalbytes += read1; 
      int read2 = inputStream2.read(buff2); 
      if (read1 != read2) { 
       break; 
      } 
      if (!Arrays.equals(buff1, buff2)) { 
       break; 
      } 
     } 
     if (read1 == -1) { 
      printAfterEquals(myBufferSize); 
     } else { 
      Assert.fail("files were not equal"); 
     } 
     inputStream1.close(); 
     inputStream2.close(); 
    } 

    private void compareWithBufferSizeFutures(int myBufferSize) 
      throws ExecutionException, InterruptedException, IOException { 
     final BufferedInputStream inputStream1 = 
       new BufferedInputStream(
         new FileInputStream(
           new File(fileName1)), 
         myBufferSize); 
     final BufferedInputStream inputStream2 = 
       new BufferedInputStream(
         new FileInputStream(
           new File(fileName2)), 
         myBufferSize); 

     final boolean wasEqual = isEqualsParallel(myBufferSize, inputStream1, inputStream2); 

     if (wasEqual) { 
      printAfterEquals(myBufferSize); 
     } else { 
      Assert.fail("files were not equal"); 
     } 
     inputStream1.close(); 
     inputStream2.close(); 
    } 

    private boolean isEqualsParallel(int myBufferSize 
      , final BufferedInputStream inputStream1 
      , final BufferedInputStream inputStream2) 
      throws InterruptedException, ExecutionException { 
     final byte[] buff1Even = new byte[myBufferSize]; 
     final byte[] buff1Odd = new byte[myBufferSize]; 
     final byte[] buff2Even = new byte[myBufferSize]; 
     final byte[] buff2Odd = new byte[myBufferSize]; 
     final Callable<Integer> read1Even = new Callable<Integer>() { 
      public Integer call() throws Exception { 
       return inputStream1.read(buff1Even); 
      } 
     }; 
     final Callable<Integer> read2Even = new Callable<Integer>() { 
      public Integer call() throws Exception { 
       return inputStream2.read(buff2Even); 
      } 
     }; 
     final Callable<Integer> read1Odd = new Callable<Integer>() { 
      public Integer call() throws Exception { 
       return inputStream1.read(buff1Odd); 
      } 
     }; 
     final Callable<Integer> read2Odd = new Callable<Integer>() { 
      public Integer call() throws Exception { 
       return inputStream2.read(buff2Odd); 
      } 
     }; 
     final Callable<Boolean> oddEqualsArray = new Callable<Boolean>() { 
      public Boolean call() throws Exception { 
       return Arrays.equals(buff1Odd, buff2Odd); 
      } 
     }; 
     final Callable<Boolean> evenEqualsArray = new Callable<Boolean>() { 
      public Boolean call() throws Exception { 
       return Arrays.equals(buff1Even, buff2Even); 
      } 
     }; 

     ExecutorService executor = Executors.newCachedThreadPool(); 
     boolean isEven = true; 
     Future<Integer> read1 = null; 
     Future<Integer> read2 = null; 
     Future<Boolean> isEqual = null; 
     int lastSize = 0; 
     while (true) { 
      if (isEqual != null) { 
       if (!isEqual.get()) { 
        return false; 
       } else if (lastSize == -1) { 
        return true; 
       } 
      } 
      if (read1 != null) { 
       lastSize = read1.get(); 
       totalbytes += lastSize; 
       final int size2 = read2.get(); 
       if (lastSize != size2) { 
        return false; 
       } 
      } 
      isEven = !isEven; 
      if (isEven) { 
       if (read1 != null) { 
        isEqual = executor.submit(oddEqualsArray); 
       } 
       read1 = executor.submit(read1Even); 
       read2 = executor.submit(read2Even); 
      } else { 
       if (read1 != null) { 
        isEqual = executor.submit(evenEqualsArray); 
       } 
       read1 = executor.submit(read1Odd); 
       read2 = executor.submit(read2Odd); 
      } 
     } 
    } 

    private boolean isEqualsNio(int myBufferSize) throws IOException { 
     FileChannel first = null, seconde = null; 
     try { 
      first = new FileInputStream(fileName1).getChannel(); 
      seconde = new FileInputStream(fileName2).getChannel(); 
      if (first.size() != seconde.size()) { 
       return false; 
      } 
      ByteBuffer firstBuffer = ByteBuffer.allocateDirect(myBufferSize); 
      ByteBuffer secondBuffer = ByteBuffer.allocateDirect(myBufferSize); 
      int firstRead, secondRead; 
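      while (first.position() < first.size()) { 
       // clear() resets position and limit so each read refills the buffer from index 0 
       firstBuffer.clear(); 
       secondBuffer.clear(); 
       firstRead = first.read(firstBuffer); 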
       totalbytes += firstRead; 
       secondRead = seconde.read(secondBuffer); 
       if (firstRead != secondRead) { 
        return false; 
       } 
       if (!nioBuffersEqual(firstBuffer, secondBuffer, firstRead)) { 
        return false; 
       } 
      } 
      return true; 
     } finally { 
      if (first != null) { 
       first.close(); 
      } 
      if (seconde != null) { 
       seconde.close(); 
      } 
     } 
    } 

    private static boolean nioBuffersEqual(ByteBuffer first, ByteBuffer second, final int length) { 
     if (first.limit() != second.limit() || length > first.limit()) { 
      return false; 
     } 
     first.rewind(); 
     second.rewind(); 
     for (int i = 0; i < length; i++) { 
      if (first.get() != second.get()) { 
       return false; 
      } 
     } 
     return true; 
    } 

    private void printAfterEquals(int myBufferSize) { 
     NumberFormat nf = new DecimalFormat("#.00"); 
     final long dur = System.currentTimeMillis() - start; 
     double seconds = dur/1000d; 
     double megabytes = totalbytes / 1024d / 1024d; // floating-point division keeps the fractional megabytes 
     double rate = (megabytes)/seconds; 
     System.out.println("I was equal, even after " + totalbytes 
       + " bytes and reading for " + dur 
       + " ms (" + nf.format(rate) + "MB/sec * 2)" + 
       " with a buffer size of " + myBufferSize/1024 + " kB"); 
    } 
}

2009-06-11 10:20:30

+1. Nice work, Andreas. Could I bother you to run it with a 64 MB (yes, megabyte) buffer on the same data and the same machine? @alamar thinks this will somehow give magically excellent results due to a lack of seeking, which I am skeptical of, having real-world experience closer to your results.

My own experiments also showed that buffer sizes of 64 KB/128 KB are optimal, as in your tests. For reading a byte[] of 64 KB in one go, it does not matter whether I wrap the FileInputStream in a BufferedInputStream or not; they perform the same. I did run into trouble, though, because once a file has been read, later timings get smaller due to disk caching.

With such large files, you are going to get much better performance with java.nio.

Additionally, reading single bytes with Java streams can be very slow. Using a byte array (2-6K elements in my own experience; YMMV, as it seems platform/application specific) will dramatically improve read performance with streams.
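
To illustrate reading into a byte array rather than byte by byte, here is a minimal sketch of a chunked comparison with plain streams. It assumes Java 9 or later (for InputStream.readNBytes and the range form of Arrays.equals); the class name, method name, and the bufferSize parameter are illustrative, and bufferSize must be greater than zero. The best buffer size is something to measure, not something this sketch settles.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class ChunkedStreamCompare {

    // Compares two files chunk by chunk; bufferSize (> 0) is the value to tune.
    static boolean sameContent(Path a, Path b, int bufferSize) throws IOException {
        if (Files.size(a) != Files.size(b)) {
            return false; // files of different length can never match
        }
        byte[] bufA = new byte[bufferSize];
        byte[] bufB = new byte[bufferSize];
        try (InputStream inA = Files.newInputStream(a);
             InputStream inB = Files.newInputStream(b)) {
            while (true) {
                // readNBytes keeps reading until the array is full or the stream ends
                int nA = inA.readNBytes(bufA, 0, bufferSize);
                int nB = inB.readNBytes(bufB, 0, bufferSize);
                if (nA != nB) {
                    return false;
                }
                if (nA == 0) {
                    return true; // both streams exhausted without a mismatch
                }
                // compare only the bytes that were actually read in this round
                if (!Arrays.equals(bufA, 0, nA, bufB, 0, nB)) {
                    return false;
                }
            }
        }
    }
}
```

A call such as ChunkedStreamCompare.sameContent(pathA, pathB, 64 * 1024) stops at the first differing chunk, so files that differ early are rejected without reading the rest.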

2009-06-08 11:00:29

I was "afraid" of that. The code is quite old and has worked fine for a long time, but the files keep growing...

If you use a MappedByteBuffer (which uses the operating system's virtual memory paging subsystem), you can keep the code changes to a minimum and still get substantial speed improvements. I would venture a guess of "orders of magnitude" faster.

NIO can be confusing; you'll want to start with ByteBuffer.allocateDirect() to get the most performance (direct buffers live outside the Java heap; memory-mapped files are a separate mechanism, reached through FileChannel.map()). http://java.sun.com/javase/6/docs/api/java/nio/ByteBuffer.html – AgileJon
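
For reference, a minimal sketch of what a MappedByteBuffer comparison could look like. The class name, method name, and the window parameter are assumptions made for this example. Files are mapped in windows because a single mapping cannot exceed Integer.MAX_VALUE bytes. Whether this beats plain reads on a given disk is something only a measurement can show.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedCompare {

    // window: number of bytes mapped at a time (> 0), e.g. 64L * 1024 * 1024
    static boolean sameContent(Path a, Path b, long window) throws IOException {
        try (FileChannel chA = FileChannel.open(a, StandardOpenOption.READ);
             FileChannel chB = FileChannel.open(b, StandardOpenOption.READ)) {
            long size = chA.size();
            if (size != chB.size()) {
                return false;
            }
            for (long pos = 0; pos < size; pos += window) {
                long len = Math.min(window, size - pos);
                MappedByteBuffer mapA = chA.map(FileChannel.MapMode.READ_ONLY, pos, len);
                MappedByteBuffer mapB = chB.map(FileChannel.MapMode.READ_ONLY, pos, len);
                // ByteBuffer.equals compares the remaining bytes of both buffers
                if (!mapA.equals(mapB)) {
                    return false;
                }
            }
            return true;
        }
    }
}
```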

Reading and writing files with Java can be just as fast. You can use FileChannels. As for comparing the files, this will obviously take a lot of time when comparing byte by byte. Here's an example using FileChannels and ByteBuffers (it could be further optimized):

public static boolean compare(String firstPath, String secondPath, final int BUFFER_SIZE) throws IOException { 
    FileChannel firstIn = null, secondIn = null; 
    try { 
     firstIn = new FileInputStream(firstPath).getChannel(); 
     secondIn = new FileInputStream(secondPath).getChannel(); 
     if (firstIn.size() != secondIn.size()) 
      return false; 
     ByteBuffer firstBuffer = ByteBuffer.allocateDirect(BUFFER_SIZE); 
     ByteBuffer secondBuffer = ByteBuffer.allocateDirect(BUFFER_SIZE); 
     int firstRead, secondRead; 
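     while (firstIn.position() < firstIn.size()) { 
      // clear() resets position and limit so each read refills the buffer from index 0 
      firstBuffer.clear(); 
      secondBuffer.clear(); 
      firstRead = firstIn.read(firstBuffer); 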
      secondRead = secondIn.read(secondBuffer); 
      if (firstRead != secondRead) 
       return false; 
      if (!buffersEqual(firstBuffer, secondBuffer, firstRead)) 
       return false; 
     } 
     return true; 
    } finally { 
     if (firstIn != null) firstIn.close(); 
     if (secondIn != null) secondIn.close(); 
    } 
} 

private static boolean buffersEqual(ByteBuffer first, ByteBuffer second, final int length) { 
    if (first.limit() != second.limit()) 
     return false; 
    if (length > first.limit()) 
     return false; 
    first.rewind(); second.rewind(); 
    for (int i=0; i<length; i++) 
     if (first.get() != second.get()) 
      return false; 
    return true; 
}

2009-06-08 11:01:48 laginimaineb

Do you have any idea for something faster than comparing byte by byte?

Well... As I said, you can use FileChannels (and ByteBuffers). I can compare two 1.6 GB files in 60 seconds. I've edited my original post to include the code I use. – laginimaineb

I like this example. You don't need to read the whole files into arrays to compare them. Otherwise, you waste a lot of time reading files that may already differ at byte 1, instead of reading just 2 bytes.

DMA/SATA are hardware/low-level technologies and are not visible to any programming language whatsoever.

For memory-mapped input/output you should use java.nio, I believe.

Are you sure that you aren't reading those files one byte at a time? That would be wasteful. I'd recommend doing it block by block, and each block should be something like 64 megabytes to minimize seeking.

2009-06-08 11:03:07 alamar

You mean 64 KB, yes? Not megabytes?

Why, megabytes, if you can afford it (and nowadays you can). Reading two files in 64-kilobyte chunks isn't a good idea IMO, because the drive would seek endlessly. – alamar

Oh, I thought it was a typo. This looks like premature optimization to me. I question such a large value because I think the read will block until the entire 64 MB has been read, resulting in slower overall performance. Only actual performance metrics will show conclusively, but I'm very skeptical of your theory.

You can have a look at Sun's article on I/O tuning (although it is already a bit dated); maybe you can find similarities between the examples there and your code. Also have a look at the java.nio package, which contains faster I/O elements than java.io. Dr. Dobb's Journal has a quite nice article on high performance IO using java.nio.

If so, there are further examples and tuning tips available there which should help you speed up your code.

Furthermore, the Arrays class has methods for comparing byte arrays built in; maybe these can also be used to make things faster and to tidy up your loop a bit.
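
As a sketch of that suggestion: since Java 9 (newer than this answer), java.util.Arrays also offers mismatch, which returns the index of the first differing element or -1 if there is none. The hypothetical helper below keeps the contract of the loop in the question (0 when equal, otherwise the difference of the first differing bytes); the class and method names are made up for the example.

```java
import java.util.Arrays;

class ChunkDiff {

    // Returns 0 if the first `length` bytes match, otherwise the difference
    // between the first pair of bytes that differ.
    static int compareChunks(byte[] first, byte[] second, int length) {
        int k = Arrays.mismatch(first, 0, length, second, 0, length);
        return k < 0 ? 0 : first[k] - second[k];
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3, 4};
        byte[] b = {1, 2, 9, 4};
        System.out.println(compareChunks(a, a.clone(), 4)); // 0, no mismatch
        System.out.println(compareChunks(a, b, 4));         // -6, that is 3 - 9
    }
}
```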

2009-06-08 11:06:28 Kosi2801

Good idea with the Arrays class. Under the hood it does a byte-wise comparison in a tight loop. It is not as fast as the 32-fold unrolled loop I'm currently using, but it will shorten the code considerably, esp. for testing IO performance.

The following is a good article on the relative merits of the different ways of reading a file in Java. It may be of some use:

How to read files quickly

2009-06-08 11:08:02

Thanks a billion for the link!!! – Ahamed

For a better comparison, try copying two files at once. A hard drive can read one file much more efficiently than two (as the head has to move back and forth to read). One way to reduce this is to use larger buffers, e.g. 16 MB, with ByteBuffer.

With ByteBuffer you can compare 8 bytes at a time by comparing long values with getLong().

If your Java code is efficient, most of the work is done in the disk/OS for reading and writing, so it shouldn't be much slower than using any other language (as the disk/OS is the bottleneck).

Don't assume Java is slow until you have determined that it isn't a bug in your code.
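
A minimal sketch of the large-buffer idea, under the assumption that alternating long sequential reads reduces head movement: each direct buffer is filled completely before the other file is touched, and the filled buffers are compared with ByteBuffer.equals. The class and method names and the bufferSize parameter (which must be greater than zero, e.g. 16 * 1024 * 1024) are illustrative only.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class LargeBufferCompare {

    // Reads until the buffer is full or the channel reaches end of file.
    private static void fill(FileChannel ch, ByteBuffer buf) throws IOException {
        buf.clear();
        while (buf.hasRemaining() && ch.read(buf) != -1) {
            // a single read() may deliver fewer bytes than requested, so keep going
        }
        buf.flip(); // position = 0, limit = number of bytes read
    }

    static boolean sameContent(Path a, Path b, int bufferSize) throws IOException {
        try (FileChannel chA = FileChannel.open(a, StandardOpenOption.READ);
             FileChannel chB = FileChannel.open(b, StandardOpenOption.READ)) {
            if (chA.size() != chB.size()) {
                return false;
            }
            ByteBuffer bufA = ByteBuffer.allocateDirect(bufferSize);
            ByteBuffer bufB = ByteBuffer.allocateDirect(bufferSize);
            while (true) {
                fill(chA, bufA); // one long sequential read from the first file
                fill(chB, bufB); // then one from the second
                if (!bufA.hasRemaining() && !bufB.hasRemaining()) {
                    return true; // both files exhausted without a mismatch
                }
                if (!bufA.equals(bufB)) {
                    return false; // differing length or content in this chunk
                }
            }
        }
    }
}
```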

2009-06-09 06:32:19

I wonder about comparing longs, because they have to be constructed on the fly. Shouldn't it be faster to unroll the loop 8 times (or 32 times, which seemed to be optimal in my experiments)? I agree that "select is not broken", so IO will be fast, but I know from the past (and maybe it is no longer true) that Java IO is/was much slower than, say, Pascal/C IO. But since most apps do more than just plain IO, Java still does well nowadays.

For direct ByteBuffers, the longs are not constructed in Java. It is just one call into JNI (they could be constructed in C code). I'm fairly sure that is faster than one call per byte. In a later answer, I show that this runs at 74.8 MB/s reading two files from the same disk, which may be fast enough.

@Peter Lawrey: Have you tested those big buffers? It was you who [told me](http://stackoverflow.com/a/11610367/581205) that the OS can prefetch multiple files, and it seems to use huge buffers internally when doing so. – maaartinus

After modifying your NIO compare function I get the following results.

I was equal, even after 4294967296 bytes and reading for 304594 ms (13.45MB/sec * 2) with a buffer size of 1024 kB 
I was equal, even after 4294967296 bytes and reading for 225078 ms (18.20MB/sec * 2) with a buffer size of 4096 kB 
I was equal, even after 4294967296 bytes and reading for 221351 ms (18.50MB/sec * 2) with a buffer size of 16384 kB

Note: this means the files are being read at a combined rate of 37 MB/s.

Running the same thing on a faster drive:

I was equal, even after 4294967296 bytes and reading for 178087 ms (23.00MB/sec * 2) with a buffer size of 1024 kB 
I was equal, even after 4294967296 bytes and reading for 119084 ms (34.40MB/sec * 2) with a buffer size of 4096 kB 
I was equal, even after 4294967296 bytes and reading for 109549 ms (37.39MB/sec * 2) with a buffer size of 16384 kB

Note: this means the files are being read at a combined rate of 74.8 MB/s.
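
The combined figure is simply twice the per-file rate printed in the log, since both files are read in the same elapsed time. A minimal sketch of that arithmetic, using the byte count and elapsed time from the 16384 kB line of the faster drive (the class and method names are illustrative only):

```java
class Throughput {

    // MB/s for one file, with 1 MB = 1024 * 1024 bytes as in the benchmark output.
    static double mbPerSecond(long bytes, long millis) {
        double megabytes = bytes / (1024.0 * 1024.0);
        return megabytes / (millis / 1000.0);
    }

    public static void main(String[] args) {
        double perFile = mbPerSecond(4294967296L, 109549L); // about 37.39
        double combined = 2 * perFile;                      // about 74.8
        System.out.println(perFile + " MB/s per file, " + combined + " MB/s combined");
    }
}
```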

private static boolean nioBuffersEqual(ByteBuffer first, ByteBuffer second, final int length) { 
    if (first.limit() != second.limit() || length > first.limit()) { 
     return false; 
    } 
    first.rewind(); 
    second.rewind(); 
    int i; 
    for (i = 0; i < length-7; i+=8) { 
     if (first.getLong() != second.getLong()) { 
      return false; 
     } 
    } 
    for (; i < length; i++) { 
     if (first.get() != second.get()) { 
      return false; 
     } 
    } 
    return true; 
}

2009-06-12 06:58:44

Try setting the buffer on the input stream to several megabytes.

2009-06-13 13:47:10

I found that a lot of the articles linked in this post are really out of date (although there is some very insightful stuff as well). Some of the linked articles are from 2001, and the information is questionable at best. Martin Thompson of Mechanical Sympathy wrote quite a bit about this in 2011. Please refer to what he wrote for the background and theory.

I have found that NIO or not NIO has very little to do with performance. It is much more about the size of your output buffers (read: byte arrays). NIO is no magic make-it-go-fast web-scale sauce.

I was able to take Martin's examples and use the 1.0-era OutputStream and make it scream. NIO is fast too, but the biggest indicator is just the size of the output buffer, not whether or not you use NIO, unless of course you are using memory-mapped NIO; then it matters. :)

If you want up-to-date, authoritative information on this, see Martin's blog:

http://mechanical-sympathy.blogspot.com/2011/12/java-sequential-io-performance.html

If you want to see how NIO does not make that much of a difference (I was able to write examples using regular IO that were faster), see this:

http://www.dzone.com/links/fast_java_io_nio_is_always_faster_than_fileoutput.html

I have tested my assumption on a new Windows laptop with a fast hard disk, my MacBook Pro with an SSD, an EC2 xlarge, and an EC2 4x large with maxed-out IOPS/high-speed I/O (and soon on a large NAS fibre disk array), so it works (there are some issues with smaller EC2 instances, but if you care about performance... are you going to use a small EC2 instance?). If you use real hardware, in my tests so far, traditional IO always wins. If you use high-I/O EC2, it is also a clear winner. If you use underpowered EC2 instances, NIO can win.

There is no substitute for benchmarking.

Anyway, I am no expert; I just did some empirical testing using the framework that Sir Martin Thompson wrote up in his blog post.

I took this to the next step and used Files.newInputStream (from JDK 7) with TransferQueue to create a recipe for making Java I/O scream (even on small EC2 instances). The recipe can be found at the bottom of this documentation for Boon (https://github.com/RichardHightower/boon/wiki/Auto-Growable-Byte-Buffer-like-a-ByteBuilder). This allows me to use a traditional OutputStream, but with something that works well on smaller EC2 instances. (I am the main author of Boon. But I am accepting new authors. The pay sucks. $0 per hour. But the good news is, I can double your pay whenever you like.)

My 2 cents.

See this to see why TransferQueue is important: http://php.sabscape.com/blog/?p=557

Key learnings:

If you care about performance never, ever, ever use BufferedOutputStream.
NIO does not always equal performance.
Buffer size matters most.
Recycling buffers for high-speed writes is critical.
GC can/will/does implode your performance for high-speed writes.
You have to have some mechanism to reuse spent buffers.
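
To make the buffer-recycling point concrete, here is a minimal sketch; it is not the Boon recipe, and every name in it (RecyclingReader, Chunk, countBytes, the free and filled queues) is invented for the example. A reader thread fills byte arrays taken from a small pool and hands them over through a TransferQueue; the consumer processes each chunk and returns the array to the pool, so the large arrays are allocated only once. Counting bytes stands in for the real processing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.TransferQueue;

class RecyclingReader {

    // A filled buffer plus its number of valid bytes; length -1 marks end of input.
    private static final class Chunk {
        final byte[] data;
        final int length;
        Chunk(byte[] data, int length) { this.data = data; this.length = length; }
    }

    // bufferSize and bufferCount must both be greater than zero.
    static long countBytes(Path file, int bufferSize, int bufferCount)
            throws IOException, InterruptedException {
        BlockingQueue<byte[]> free = new ArrayBlockingQueue<>(bufferCount);
        TransferQueue<Chunk> filled = new LinkedTransferQueue<>();
        for (int i = 0; i < bufferCount; i++) {
            free.add(new byte[bufferSize]); // allocated once, reused afterwards
        }
        IOException[] failure = new IOException[1];
        Thread reader = new Thread(() -> {
            try (InputStream in = Files.newInputStream(file)) {
                while (true) {
                    byte[] buf = free.take();      // wait for a recycled buffer
                    int n = in.read(buf);
                    filled.put(new Chunk(buf, n)); // hand the chunk to the consumer
                    if (n == -1) {
                        return;
                    }
                }
            } catch (IOException e) {
                failure[0] = e;
                filled.offer(new Chunk(null, -1)); // unblock the consumer
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.start();
        long total = 0;
        while (true) {
            Chunk c = filled.take();
            if (c.length == -1) {
                break;
            }
            total += c.length; // stand-in for the real work on c.data
            free.put(c.data);  // give the buffer back for reuse
        }
        reader.join();
        if (failure[0] != null) {
            throw failure[0];
        }
        return total;
    }
}
```

The pool size bounds memory use: the reader can never be more than bufferCount chunks ahead of the consumer.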

2013-11-09 22:48:39 RickHigh
