2012-06-22 16 views
14

Why does MySQL performance decrease when queries are executed in parallel?

The question: why does MySQL performance drop for queries that join nearly empty tables when they are run in parallel?

A more detailed explanation of the problem I am facing follows. I have two tables in MySQL:

CREATE TABLE first (
    num int(10) NOT NULL, 
    UNIQUE KEY key_num (num) 
) ENGINE=InnoDB;

CREATE TABLE second (
    num int(10) NOT NULL, 
    num2 int(10) NOT NULL, 
    UNIQUE KEY key_num (num, num2) 
) ENGINE=InnoDB;

The first table contains about a thousand records. The second is empty or contains very few records. It also has a two-column index, which is somehow related to the problem: the problem disappears with a single-column index. Now I am trying to run many identical queries against these tables in parallel. Each query looks like this:

SELECT first.num 
FROM first 
LEFT JOIN second AS second_1 ON second_1.num = -1 # non-existent key 
LEFT JOIN second AS second_2 ON second_2.num = -2 # non-existent key 
LEFT JOIN second AS second_3 ON second_3.num = -3 # non-existent key 
LEFT JOIN second AS second_4 ON second_4.num = -4 # non-existent key 
LEFT JOIN second AS second_5 ON second_5.num = -5 # non-existent key 
LEFT JOIN second AS second_6 ON second_6.num = -6 # non-existent key 
WHERE second_1.num IS NULL 
    AND second_2.num IS NULL 
    AND second_3.num IS NULL 
    AND second_4.num IS NULL 
    AND second_5.num IS NULL 
    AND second_6.num IS NULL 

The problem is that instead of a nearly linear increase in performance on an 8-core machine, I actually get a drop. With one process, I typically get about 200 queries per second. With two processes, instead of the expected increase to 300-400 queries per second, I actually get a drop to 150. With 10 processes I get only 70 queries per second. The Perl code I am using for the test is as follows:

#!/usr/bin/perl 

use strict; 
use warnings; 

use DBI; 
use Parallel::Benchmark; 
use SQL::Abstract; 
use SQL::Abstract::Plugin::InsertMulti; 

my $children_dbh; 

foreach my $second_table_row_count (0, 1, 1000) { 
    print '#' x 80, "\nsecond_table_row_count = $second_table_row_count\n"; 
    create_and_fill_tables(1000, $second_table_row_count); 
    foreach my $concurrency (1, 2, 3, 4, 6, 8, 10, 20) { 
     my $bm = Parallel::Benchmark->new(
      'benchmark' => sub { 
       _run_sql(); 
       return 1; 
      }, 
      'concurrency' => $concurrency, 
      'time' => 3, 
     ); 
     my $result = $bm->run(); 
    } 
} 

sub create_and_fill_tables { 
    my ($first_table_row_count, $second_table_row_count) = @_; 
    my $dbh = dbi_connect(); 
    { 
     $dbh->do(q{DROP TABLE IF EXISTS first}); 
     $dbh->do(q{ 
      CREATE TABLE first (
       num int(10) NOT NULL, 
       UNIQUE KEY key_num (num) 
      ) ENGINE=InnoDB 
     }); 
     if ($first_table_row_count) { 
      my ($stmt, @bind) = SQL::Abstract->new()->insert_multi(
       'first', 
       ['num'], 
       [map {[$_]} 1 .. $first_table_row_count], 
      ); 
      $dbh->do($stmt, undef, @bind); 
     } 
    } 
    { 
     $dbh->do(q{DROP TABLE IF EXISTS second}); 
     $dbh->do(q{ 
      CREATE TABLE second (
       num int(10) NOT NULL, 
       num2 int(10) NOT NULL, 
       UNIQUE KEY key_num (num, num2) 
      ) ENGINE=InnoDB 
     }); 
     if ($second_table_row_count) { 
      my ($stmt, @bind) = SQL::Abstract->new()->insert_multi(
       'second', 
       ['num', 'num2'], # num2 is NOT NULL, so fill it too (rows [n, n]) 
       [map {[$_, $_]} 1 .. $second_table_row_count], 
      ); 
      $dbh->do($stmt, undef, @bind); 
     } 
    } 
} 

sub _run_sql { 
    $children_dbh ||= dbi_connect(); 
    $children_dbh->selectall_arrayref(q{ 
     SELECT first.num 
     FROM first 
     LEFT JOIN second AS second_1 ON second_1.num = -1 
     LEFT JOIN second AS second_2 ON second_2.num = -2 
     LEFT JOIN second AS second_3 ON second_3.num = -3 
     LEFT JOIN second AS second_4 ON second_4.num = -4 
     LEFT JOIN second AS second_5 ON second_5.num = -5 
     LEFT JOIN second AS second_6 ON second_6.num = -6 
     WHERE second_1.num IS NULL 
      AND second_2.num IS NULL 
      AND second_3.num IS NULL 
      AND second_4.num IS NULL 
      AND second_5.num IS NULL 
      AND second_6.num IS NULL 
    }); 
} 

sub dbi_connect { 
    return DBI->connect(
     'dbi:mysql:' 
      . 'database=tmp' 
      . ';host=localhost' 
      . ';port=3306', 
     'root', 
     '', 
    ); 
} 

And for comparison, queries like the following do gain performance when run concurrently:

SELECT first.num 
FROM first 
LEFT JOIN second AS second_1 ON second_1.num = 1 # existent key 
LEFT JOIN second AS second_2 ON second_2.num = 2 # existent key 
LEFT JOIN second AS second_3 ON second_3.num = 3 # existent key 
LEFT JOIN second AS second_4 ON second_4.num = 4 # existent key 
LEFT JOIN second AS second_5 ON second_5.num = 5 # existent key 
LEFT JOIN second AS second_6 ON second_6.num = 6 # existent key 
WHERE second_1.num IS NOT NULL 
    AND second_2.num IS NOT NULL 
    AND second_3.num IS NOT NULL 
    AND second_4.num IS NOT NULL 
    AND second_5.num IS NOT NULL 
    AND second_6.num IS NOT NULL 

Test results, along with CPU and disk usage measurements, are here:

 
* table `first` has 1000 rows 
* table `second` has 6 rows: `[1,1],[2,2],..[6,6]` 

For query: 
    SELECT first.num 
    FROM first 
    LEFT JOIN second AS second_1 ON second_1.num = -1 # non-existent key 
    LEFT JOIN second AS second_2 ON second_2.num = -2 # non-existent key 
    LEFT JOIN second AS second_3 ON second_3.num = -3 # non-existent key 
    LEFT JOIN second AS second_4 ON second_4.num = -4 # non-existent key 
    LEFT JOIN second AS second_5 ON second_5.num = -5 # non-existent key 
    LEFT JOIN second AS second_6 ON second_6.num = -6 # non-existent key 
    WHERE second_1.num IS NULL 
     AND second_2.num IS NULL 
     AND second_3.num IS NULL 
     AND second_4.num IS NULL 
     AND second_5.num IS NULL 
     AND second_6.num IS NULL 

Results: 
    concurrency: 1,  speed: 162.910/sec 
    concurrency: 2,  speed: 137.818/sec 
    concurrency: 3,  speed: 130.728/sec 
    concurrency: 4,  speed: 107.387/sec 
    concurrency: 6,  speed: 90.513/sec 
    concurrency: 8,  speed: 80.445/sec 
    concurrency: 10, speed: 80.381/sec 
    concurrency: 20, speed: 84.069/sec 

System usage over the last 60 minutes of running the query in 6 processes: 
    $ iostat -cdkx 60 

    avg-cpu: %user %nice %system %iowait %steal %idle 
       74.82 0.00 0.08 0.00 0.08 25.02 

    Device:   rrqm/s wrqm/s  r/s  w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util 
    sda1    0.00  0.00 0.00 0.12  0.00  0.80 13.71  0.00 1.43 1.43 0.02 
    sdf10    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 10.00 5.00 0.02 
    sdf4    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 30.00 15.00 0.05 
    sdm    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdf8    0.00  0.00 0.00 0.37  0.00  1.24  6.77  0.00 5.00 3.18 0.12 
    sdf6    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 10.00 5.00 0.02 
    sdf9    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 0.00 0.00 0.00 
    sdf    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdf3    0.00  0.00 0.00 0.08  0.00  1.33 32.00  0.00 4.00 4.00 0.03 
    sdf2    0.00  0.00 0.00 0.17  0.00  1.37 16.50  0.00 3.00 3.00 0.05 
    sdf15    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdf14    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdf1    0.00  0.00 0.00 0.05  0.00  0.40 16.00  0.00 0.00 0.00 0.00 
    sdf13    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 10.00 5.00 0.02 
    sdf5    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 50.00 25.00 0.08 
    sdm2    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdm1    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdf12    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 10.00 5.00 0.02 
    sdf11    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 10.00 5.00 0.02 
    sdf7    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 10.00 5.00 0.02 
    md0    0.00  0.00 0.00 0.97  0.00 13.95 28.86  0.00 0.00 0.00 0.00 

################################################################################ 

For query: 
    SELECT first.num 
    FROM first 
    LEFT JOIN second AS second_1 ON second_1.num = 1 # existent key 
    LEFT JOIN second AS second_2 ON second_2.num = 2 # existent key 
    LEFT JOIN second AS second_3 ON second_3.num = 3 # existent key 
    LEFT JOIN second AS second_4 ON second_4.num = 4 # existent key 
    LEFT JOIN second AS second_5 ON second_5.num = 5 # existent key 
    LEFT JOIN second AS second_6 ON second_6.num = 6 # existent key 
    WHERE second_1.num IS NOT NULL 
     AND second_2.num IS NOT NULL 
     AND second_3.num IS NOT NULL 
     AND second_4.num IS NOT NULL 
     AND second_5.num IS NOT NULL 
     AND second_6.num IS NOT NULL 

Results: 
    concurrency: 1,  speed: 875.973/sec 
    concurrency: 2,  speed: 944.986/sec 
    concurrency: 3,  speed: 1256.072/sec 
    concurrency: 4,  speed: 1401.657/sec 
    concurrency: 6,  speed: 1354.351/sec 
    concurrency: 8,  speed: 1110.100/sec 
    concurrency: 10, speed: 1145.251/sec 
    concurrency: 20, speed: 1142.514/sec 

System usage over the last 60 minutes of running the query in 6 processes: 
    $ iostat -cdkx 60 

    avg-cpu: %user %nice %system %iowait %steal %idle 
       74.40 0.00 0.53 0.00 0.06 25.01 

    Device:   rrqm/s wrqm/s  r/s  w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util 
    sda1    0.00  0.00 0.00 0.02  0.00  0.13 16.00  0.00 0.00 0.00 0.00 
    sdf10    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 10.00 5.00 0.02 
    sdf4    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 10.00 5.00 0.02 
    sdm    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdf8    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 10.00 5.00 0.02 
    sdf6    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 0.00 0.00 0.00 
    sdf9    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 10.00 5.00 0.02 
    sdf    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdf3    0.00  0.00 0.00 0.13  0.00  2.67 40.00  0.00 3.75 2.50 0.03 
    sdf2    0.00  0.00 0.00 0.23  0.00  2.72 23.29  0.00 2.14 1.43 0.03 
    sdf15    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdf14    0.00  0.00 0.00 0.98  0.00  0.54  1.10  0.00 2.71 2.71 0.27 
    sdf1    0.00  0.00 0.00 0.08  0.00  1.47 35.20  0.00 8.00 6.00 0.05 
    sdf13    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdf5    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 10.00 5.00 0.02 
    sdm2    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdm1    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdf12    0.00  0.00 0.00 0.00  0.00  0.00  0.00  0.00 0.00 0.00 0.00 
    sdf11    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 0.00 0.00 0.00 
    sdf7    0.00  0.00 0.00 0.03  0.00  1.07 64.00  0.00 10.00 5.00 0.02 
    md0    0.00  0.00 0.00 1.70  0.00 15.92 18.74  0.00 0.00 0.00 0.00 

################################################################################ 

And this server has lots of free memory. Example of top: 
    top - 19:02:59 up 4:23, 4 users, load average: 4.43, 3.03, 2.01 
    Tasks: 218 total, 1 running, 217 sleeping, 0 stopped, 0 zombie 
    Cpu(s): 72.8%us, 0.7%sy, 0.0%ni, 26.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.1%st 
    Mem: 71701416k total, 22183980k used, 49517436k free,  284k buffers 
    Swap:  0k total,  0k used,  0k free, 1282768k cached 

     PID USER  PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 
    2506 mysql  20 0 51.7g 17g 5920 S 590 25.8 213:15.12 mysqld 
    9348 topadver 20 0 72256 11m 1428 S 2 0.0 0:01.45 perl 
    9349 topadver 20 0 72256 11m 1428 S 2 0.0 0:01.44 perl 
    9350 topadver 20 0 72256 11m 1428 S 2 0.0 0:01.45 perl 
    9351 topadver 20 0 72256 11m 1428 S 1 0.0 0:01.44 perl 
    9352 topadver 20 0 72256 11m 1428 S 1 0.0 0:01.44 perl 
    9353 topadver 20 0 72256 11m 1428 S 1 0.0 0:01.44 perl 
    9346 topadver 20 0 19340 1504 1064 R 0 0.0 0:01.89 top 

Does anyone have an idea why performance drops for the query with non-existent keys?
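
For reference, the throughput figures reported above can be turned into speedup factors relative to a single process. The sketch below is only arithmetic on those numbers, written in Python; the names `speedups`, `NON_EXISTENT` and `EXISTENT` are illustrative and not part of the benchmark script.

```python
# Throughput (queries/sec) copied from the "Results" listings above.
CONCURRENCY = [1, 2, 3, 4, 6, 8, 10, 20]
NON_EXISTENT = [162.910, 137.818, 130.728, 107.387, 90.513, 80.445, 80.381, 84.069]
EXISTENT = [875.973, 944.986, 1256.072, 1401.657, 1354.351, 1110.100, 1145.251, 1142.514]


def speedups(rates):
    """Return each rate divided by the single-process rate (first entry)."""
    base = rates[0]
    return [rate / base for rate in rates]


# Print one line per concurrency level for both queries.
for n, s_null, s_not_null in zip(CONCURRENCY, speedups(NON_EXISTENT), speedups(EXISTENT)):
    print(f"concurrency {n:2d}: non-existent keys x{s_null:.2f}, existing keys x{s_not_null:.2f}")
```

On these figures the non-existent-key query never exceeds its single-process rate, while the existing-key query peaks at roughly 1.6 times its single-process rate at concurrency 4, still far from linear on 8 cores.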

+0

Why do you use "WHERE num ... IS NULL" if the CREATE TABLE already has a NOT NULL constraint? – jcho360

+0

@jcho360 The left joins will produce null values like that. This looks like a configuration issue. Awayka, could you provide some information about your MySQL server? Does it have multiple processors? – Twelfth

+0

@Twelfth mysql Ver 14.14 Distrib 5.1.59, for debian-linux-gnu (x86_64) using readline 5.1, on an m2.4xlarge EC2 instance that has 8 cores – awayka

Answers

1

I suggest trying an approach where each fork uses its own connection (it seems to me that right now $children_dbh, which holds a DB connection, is a shared variable). Or, even better, implement a so-called connection pool, from which each client process takes a connection when it needs one and returns it when it is no longer needed.

See this answer for more details: the thread it comes from is about Java, but it really covers some universal principles of working with MySQL. And this answer might be useful as well.

P.S. A rather similar situation (I think) is described here, along with a detailed explanation of how to set up a connection pool.
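
A minimal sketch of the "one connection per process" idea, written in Python rather than Perl. Here `connect` is a hypothetical stand-in for a real driver call and returns a plain dict, so nothing in it talks to MySQL; the point is only the lazy, per-process initialisation pattern. The question's `$children_dbh ||= dbi_connect();` appears to follow the same pattern when `_run_sql()` first runs inside each forked child.

```python
import os

# Module-level handle. After a fork, every process has its own copy of it.
_connection = None


def connect():
    """Hypothetical stand-in for a real database connect call."""
    return {"owner_pid": os.getpid()}


def get_connection():
    """Return this process's connection, creating it on first use.

    The PID check also replaces a handle inherited from a parent process,
    so a forked child never reuses its parent's connection.
    """
    global _connection
    if _connection is None or _connection["owner_pid"] != os.getpid():
        _connection = connect()
    return _connection
```

Calling `get_connection()` repeatedly in the same process returns the same object, so the connection cost is paid once per process rather than once per query.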

+0

What will this line, '$children_dbh ||= dbi_connect();' in '_run_sql()', do? – raina77ow

+3

It looks to me like there are no threads here: a single thread per process. Where did you see threads? [Parallel::Benchmark](http://search.cpan.org/~fujiwara/Parallel-Benchmark-0.04/lib/Parallel/Benchmark.pm) uses [Parallel::ForkManager](http://search.cpan.org/~dlux/Parallel-ForkManager-0.7.9/lib/Parallel/ForkManager.pm), which forks. – nab

+0

My point is that there is a shared connection, which explains why performance actually gets worse with each new process. And, I repeat, it is quite easy to check how many DB connections are actually in use. There is no point in saying "it looks like" and arguing about theory: either a single connection is used or it is not. – raina77ow

8

A well-written question that shows some research.

Out of curiosity, I tried MySQL 5.6 to see what its tooling has to say about these queries.

First, note that the queries are different:

  • changing the value from "1" to "-1" for the existent/non-existent key case is one thing
  • changing "second_1.num IS NOT NULL" to "second_1.num IS NULL" in the WHERE clause is another.

Using EXPLAIN gives different plans:

EXPLAIN SELECT `first`.num 
FROM `first` 
LEFT JOIN `second` AS second_1 ON second_1.num = -1 # non-existent key 
LEFT JOIN `second` AS second_2 ON second_2.num = -2 # non-existent key 
LEFT JOIN `second` AS second_3 ON second_3.num = -3 # non-existent key 
LEFT JOIN `second` AS second_4 ON second_4.num = -4 # non-existent key 
LEFT JOIN `second` AS second_5 ON second_5.num = -5 # non-existent key 
LEFT JOIN `second` AS second_6 ON second_6.num = -6 # non-existent key 
WHERE second_1.num IS NULL 
AND second_2.num IS NULL 
AND second_3.num IS NULL 
AND second_4.num IS NULL 
AND second_5.num IS NULL 
AND second_6.num IS NULL 
; 
id  select_type  table type possible_keys key  key_len ref  rows Extra 
1  SIMPLE first index NULL key_num 4  NULL 1000 Using index 
1  SIMPLE second_1  ref  key_num key_num 4  const 1  Using where; Not exists; Using index 
1  SIMPLE second_2  ref  key_num key_num 4  const 1  Using where; Not exists; Using index 
1  SIMPLE second_3  ref  key_num key_num 4  const 1  Using where; Not exists; Using index 
1  SIMPLE second_4  ref  key_num key_num 4  const 1  Using where; Not exists; Using index 
1  SIMPLE second_5  ref  key_num key_num 4  const 1  Using where; Not exists; Using index 
1  SIMPLE second_6  ref  key_num key_num 4  const 1  Using where; Not exists; Using index 

as opposed to

EXPLAIN SELECT `first`.num 
FROM `first` 
LEFT JOIN `second` AS second_1 ON second_1.num = 1 # existent key 
LEFT JOIN `second` AS second_2 ON second_2.num = 2 # existent key 
LEFT JOIN `second` AS second_3 ON second_3.num = 3 # existent key 
LEFT JOIN `second` AS second_4 ON second_4.num = 4 # existent key 
LEFT JOIN `second` AS second_5 ON second_5.num = 5 # existent key 
LEFT JOIN `second` AS second_6 ON second_6.num = 6 # existent key 
WHERE second_1.num IS NOT NULL 
AND second_2.num IS NOT NULL 
AND second_3.num IS NOT NULL 
AND second_4.num IS NOT NULL 
AND second_5.num IS NOT NULL 
AND second_6.num IS NOT NULL 
; 
id  select_type  table type possible_keys key  key_len ref  rows Extra 
1  SIMPLE second_1  ref  key_num key_num 4  const 1  Using index 
1  SIMPLE second_2  ref  key_num key_num 4  const 1  Using index 
1  SIMPLE second_3  ref  key_num key_num 4  const 1  Using index 
1  SIMPLE second_4  ref  key_num key_num 4  const 1  Using index 
1  SIMPLE second_5  ref  key_num key_num 4  const 1  Using index 
1  SIMPLE second_6  ref  key_num key_num 4  const 1  Using index 
1  SIMPLE first index NULL key_num 4  NULL 1000 Using index; Using join buffer (Block Nested Loop) 

Using the JSON format, we have:

EXPLAIN FORMAT=JSON SELECT `first`.num 
FROM `first` 
LEFT JOIN `second` AS second_1 ON second_1.num = -1 # non-existent key 
LEFT JOIN `second` AS second_2 ON second_2.num = -2 # non-existent key 
LEFT JOIN `second` AS second_3 ON second_3.num = -3 # non-existent key 
LEFT JOIN `second` AS second_4 ON second_4.num = -4 # non-existent key 
LEFT JOIN `second` AS second_5 ON second_5.num = -5 # non-existent key 
LEFT JOIN `second` AS second_6 ON second_6.num = -6 # non-existent key 
WHERE second_1.num IS NULL 
AND second_2.num IS NULL 
AND second_3.num IS NULL 
AND second_4.num IS NULL 
AND second_5.num IS NULL 
AND second_6.num IS NULL 
; 
EXPLAIN 
{ 
    "query_block": { 
    "select_id": 1, 
    "nested_loop": [ 
     { 
     "table": { 
      "table_name": "first", 
      "access_type": "index", 
      "key": "key_num", 
      "key_length": "4", 
      "rows": 1000, 
      "filtered": 100, 
      "using_index": true 
     } 
     }, 
     { 
     "table": { 
      "table_name": "second_1", 
      "access_type": "ref", 
      "possible_keys": [ 
      "key_num" 
      ], 
      "key": "key_num", 
      "key_length": "4", 
      "ref": [ 
      "const" 
      ], 
      "rows": 1, 
      "filtered": 100, 
      "not_exists": true, 
      "using_index": true, 
      "attached_condition": "<if>(found_match(second_1), isnull(`test`.`second_1`.`num`), true)" 
     } 
     }, 
     { 
     "table": { 
      "table_name": "second_2", 
      "access_type": "ref", 
      "possible_keys": [ 
      "key_num" 
      ], 
      "key": "key_num", 
      "key_length": "4", 
      "ref": [ 
      "const" 
      ], 
      "rows": 1, 
      "filtered": 100, 
      "not_exists": true, 
      "using_index": true, 
      "attached_condition": "<if>(found_match(second_2), isnull(`test`.`second_2`.`num`), true)" 
     } 
     }, 
     { 
     "table": { 
      "table_name": "second_3", 
      "access_type": "ref", 
      "possible_keys": [ 
      "key_num" 
      ], 
      "key": "key_num", 
      "key_length": "4", 
      "ref": [ 
      "const" 
      ], 
      "rows": 1, 
      "filtered": 100, 
      "not_exists": true, 
      "using_index": true, 
      "attached_condition": "<if>(found_match(second_3), isnull(`test`.`second_3`.`num`), true)" 
     } 
     }, 
     { 
     "table": { 
      "table_name": "second_4", 
      "access_type": "ref", 
      "possible_keys": [ 
      "key_num" 
      ], 
      "key": "key_num", 
      "key_length": "4", 
      "ref": [ 
      "const" 
      ], 
      "rows": 1, 
      "filtered": 100, 
      "not_exists": true, 
      "using_index": true, 
      "attached_condition": "<if>(found_match(second_4), isnull(`test`.`second_4`.`num`), true)" 
     } 
     }, 
     { 
     "table": { 
      "table_name": "second_5", 
      "access_type": "ref", 
      "possible_keys": [ 
      "key_num" 
      ], 
      "key": "key_num", 
      "key_length": "4", 
      "ref": [ 
      "const" 
      ], 
      "rows": 1, 
      "filtered": 100, 
      "not_exists": true, 
      "using_index": true, 
      "attached_condition": "<if>(found_match(second_5), isnull(`test`.`second_5`.`num`), true)" 
     } 
     }, 
     { 
     "table": { 
      "table_name": "second_6", 
      "access_type": "ref", 
      "possible_keys": [ 
      "key_num" 
      ], 
      "key": "key_num", 
      "key_length": "4", 
      "ref": [ 
      "const" 
      ], 
      "rows": 1, 
      "filtered": 100, 
      "not_exists": true, 
      "using_index": true, 
      "attached_condition": "<if>(found_match(second_6), isnull(`test`.`second_6`.`num`), true)" 
     } 
     } 
    ] 
    } 
} 

as opposed to

EXPLAIN FORMAT=JSON SELECT `first`.num 
FROM `first` 
LEFT JOIN `second` AS second_1 ON second_1.num = 1 # existent key 
LEFT JOIN `second` AS second_2 ON second_2.num = 2 # existent key 
LEFT JOIN `second` AS second_3 ON second_3.num = 3 # existent key 
LEFT JOIN `second` AS second_4 ON second_4.num = 4 # existent key 
LEFT JOIN `second` AS second_5 ON second_5.num = 5 # existent key 
LEFT JOIN `second` AS second_6 ON second_6.num = 6 # existent key 
WHERE second_1.num IS NOT NULL 
AND second_2.num IS NOT NULL 
AND second_3.num IS NOT NULL 
AND second_4.num IS NOT NULL 
AND second_5.num IS NOT NULL 
AND second_6.num IS NOT NULL 
; 
EXPLAIN 
{ 
    "query_block": { 
    "select_id": 1, 
    "nested_loop": [ 
     { 
     "table": { 
      "table_name": "second_1", 
      "access_type": "ref", 
      "possible_keys": [ 
      "key_num" 
      ], 
      "key": "key_num", 
      "key_length": "4", 
      "ref": [ 
      "const" 
      ], 
      "rows": 1, 
      "filtered": 100, 
      "using_index": true 
     } 
     }, 
     { 
     "table": { 
      "table_name": "second_2", 
      "access_type": "ref", 
      "possible_keys": [ 
      "key_num" 
      ], 
      "key": "key_num", 
      "key_length": "4", 
      "ref": [ 
      "const" 
      ], 
      "rows": 1, 
      "filtered": 100, 
      "using_index": true 
     } 
     }, 
     { 
     "table": { 
      "table_name": "second_3", 
      "access_type": "ref", 
      "possible_keys": [ 
      "key_num" 
      ], 
      "key": "key_num", 
      "key_length": "4", 
      "ref": [ 
      "const" 
      ], 
      "rows": 1, 
      "filtered": 100, 
      "using_index": true 
     } 
     }, 
     { 
     "table": { 
      "table_name": "second_4", 
      "access_type": "ref", 
      "possible_keys": [ 
      "key_num" 
      ], 
      "key": "key_num", 
      "key_length": "4", 
      "ref": [ 
      "const" 
      ], 
      "rows": 1, 
      "filtered": 100, 
      "using_index": true 
     } 
     }, 
     { 
     "table": { 
      "table_name": "second_5", 
      "access_type": "ref", 
      "possible_keys": [ 
      "key_num" 
      ], 
      "key": "key_num", 
      "key_length": "4", 
      "ref": [ 
      "const" 
      ], 
      "rows": 1, 
      "filtered": 100, 
      "using_index": true 
     } 
     }, 
     { 
     "table": { 
      "table_name": "second_6", 
      "access_type": "ref", 
      "possible_keys": [ 
      "key_num" 
      ], 
      "key": "key_num", 
      "key_length": "4", 
      "ref": [ 
      "const" 
      ], 
      "rows": 1, 
      "filtered": 100, 
      "using_index": true 
     } 
     }, 
     { 
     "table": { 
      "table_name": "first", 
      "access_type": "index", 
      "key": "key_num", 
      "key_length": "4", 
      "rows": 1000, 
      "filtered": 100, 
      "using_index": true, 
      "using_join_buffer": "Block Nested Loop" 
     } 
     } 
    ] 
    } 
} 

Looking at the table I/O instrumented by the performance schema at runtime, we have:

truncate table performance_schema.objects_summary_global_by_type; 
select * from performance_schema.objects_summary_global_by_type 
where OBJECT_NAME in ("first", "second"); 
OBJECT_TYPE OBJECT_SCHEMA OBJECT_NAME COUNT_STAR SUM_TIMER_WAIT MIN_TIMER_WAIT AVG_TIMER_WAIT MAX_TIMER_WAIT 
TABLE test first 0 0 0 0 0 
TABLE test second 0 0 0 0 0 
SELECT `first`.num 
FROM `first` 
LEFT JOIN `second` AS second_1 ON second_1.num = -1 # non-existent key 
LEFT JOIN `second` AS second_2 ON second_2.num = -2 # non-existent key 
LEFT JOIN `second` AS second_3 ON second_3.num = -3 # non-existent key 
LEFT JOIN `second` AS second_4 ON second_4.num = -4 # non-existent key 
LEFT JOIN `second` AS second_5 ON second_5.num = -5 # non-existent key 
LEFT JOIN `second` AS second_6 ON second_6.num = -6 # non-existent key 
WHERE second_1.num IS NULL 
AND second_2.num IS NULL 
AND second_3.num IS NULL 
AND second_4.num IS NULL 
AND second_5.num IS NULL 
AND second_6.num IS NULL 
; 
(...) 
select * from performance_schema.objects_summary_global_by_type 
where OBJECT_NAME in ("first", "second"); 
OBJECT_TYPE OBJECT_SCHEMA OBJECT_NAME COUNT_STAR SUM_TIMER_WAIT MIN_TIMER_WAIT AVG_TIMER_WAIT MAX_TIMER_WAIT 
TABLE test first 1003 5705014442 1026171 5687889 87356557 
TABLE test second 6012 271786533972 537266 45207298 1123939292 

as opposed to:

select * from performance_schema.objects_summary_global_by_type 
where OBJECT_NAME in ("first", "second"); 
OBJECT_TYPE OBJECT_SCHEMA OBJECT_NAME COUNT_STAR SUM_TIMER_WAIT MIN_TIMER_WAIT AVG_TIMER_WAIT MAX_TIMER_WAIT 
TABLE test first 1003 5211074603 969338 5195454 61066176 
TABLE test second 24 458656783 510085 19110361 66229860 

The query that scales does almost no table I/O on table second. The query that does not scale does about 6K table I/O operations on table second, or 6 times the size of table first.

This is because the query plans are different, which in turn is because the queries are different (IS NOT NULL versus IS NULL).

I think that answers the performance-related question.
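
The COUNT_STAR figures above can be compared with a few lines of arithmetic. This Python sketch only divides the numbers reported by performance_schema; the dictionary layout and the helper name `second_per_first` are illustrative.

```python
# COUNT_STAR values copied from the performance_schema output above.
TABLE_IO = {
    "non_existent_keys": {"first": 1003, "second": 6012},  # IS NULL query
    "existing_keys": {"first": 1003, "second": 24},        # IS NOT NULL query
}


def second_per_first(case):
    """Table I/O operations on `second` per operation on `first`."""
    counts = TABLE_IO[case]
    return counts["second"] / counts["first"]


for case in TABLE_IO:
    print(f"{case}: {second_per_first(case):.3f} I/O on second per I/O on first")

# How many times more table I/O the non-scaling query does on `second`.
io_ratio = TABLE_IO["non_existent_keys"]["second"] / TABLE_IO["existing_keys"]["second"]
print(f"I/O on second, non-scaling vs scaling query: x{io_ratio}")
```

With these numbers the non-scaling query performs about 6 table I/O operations on second for every one on first, roughly 250 times as many as the scaling query.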

Note that both queries returned 1000 rows in my tests, which might not be what you want. Before tuning a query to make it faster, make sure it works as expected.
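
One way to check the row counts without a MySQL server is sketched below. It uses SQLite through Python's standard `sqlite3` module as a stand-in, so it only illustrates the result semantics of the two queries and says nothing about MySQL's plans or performance. The data follows the test description in the question (1000 rows in first, rows [1,1] to [6,6] in second); the helper names `build_query` and `count_rows` are illustrative.

```python
import sqlite3


def build_query(keys, null_test):
    """Build the six-way LEFT JOIN query for the given keys and NULL test."""
    joins = "\n".join(
        f"LEFT JOIN second AS second_{i} ON second_{i}.num = {key}"
        for i, key in enumerate(keys, start=1)
    )
    conditions = " AND ".join(
        f"second_{i}.num {null_test}" for i in range(1, len(keys) + 1)
    )
    return f'SELECT "first".num FROM "first"\n{joins}\nWHERE {conditions}'


def count_rows(keys, null_test):
    """Create the two tables in memory, run the query, return the row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "first" (num INTEGER NOT NULL UNIQUE)')
    conn.execute(
        "CREATE TABLE second ("
        "num INTEGER NOT NULL, num2 INTEGER NOT NULL, UNIQUE (num, num2))"
    )
    conn.executemany(
        'INSERT INTO "first" (num) VALUES (?)', [(n,) for n in range(1, 1001)]
    )
    conn.executemany(
        "INSERT INTO second (num, num2) VALUES (?, ?)", [(n, n) for n in range(1, 7)]
    )
    rows = conn.execute(build_query(keys, null_test)).fetchall()
    conn.close()
    return len(rows)


non_existent = count_rows([-1, -2, -3, -4, -5, -6], "IS NULL")
existing = count_rows([1, 2, 3, 4, 5, 6], "IS NOT NULL")
print(non_existent, existing)
```

Because each join condition is a constant that matches either no row or exactly one row of second, neither WHERE clause filters anything out, and both queries return every row of first.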
