Why is the size of a MySQL MyISAM table the same after removing some data from a VARCHAR column?
I need to reduce the size of a MySQL database. I recoded some information, which stripped ';' and ':' from the sources column (a reduction of about 10%). After doing that, the size of the table is exactly the same as before. How is this possible? I am using the MyISAM engine.
BTW: unfortunately, I cannot compress the tables with myisampack.
mysql> INSERT INTO test SELECT protid1, protid2, CS, REPLACE(REPLACE(sources, ':', ''), ';', '') FROM homologs_9606;
Query OK, 41917131 rows affected (4 min 11.30 sec)
Records: 41917131 Duplicates: 0 Warnings: 0
mysql> select TABLE_NAME name, ROUND(TABLE_ROWS/1e6, 3) 'million rows', ROUND(DATA_LENGTH/power(2,30), 3) 'data GB', ROUND(INDEX_LENGTH/power(2,30), 3) 'index GB' from information_schema.TABLES WHERE TABLE_NAME IN ('homologs_9606', 'test') ORDER BY TABLE_ROWS DESC LIMIT 10;
+---------------+--------------+---------+----------+
| name          | million rows | data GB | index GB |
+---------------+--------------+---------+----------+
| test          |       41.917 |   0.857 |    1.075 |
| homologs_9606 |       41.917 |   0.887 |    1.075 |
+---------------+--------------+---------+----------+
2 rows in set (0.01 sec)
mysql> select * from homologs_9606 limit 10;
+---------+---------+-------+--------------------------------+
| protid1 | protid2 | CS    | sources                        |
+---------+---------+-------+--------------------------------+
| 5635338 | 1028608 | 0.000 | 10:,1                          |
| 5644385 | 1028611 | 0.947 | 5:1,1;8:0.943,35;10:1,1;11:1,1 |
| 5652325 | 1028611 | 0.947 | 5:1,1;8:0.943,35;10:1,1;11:1,1 |
| 5641128 | 1028612 | 1.000 | 8:1,10                         |
| 5636414 | 1028616 | 0.038 | 8:0.038,104;10:,1              |
| 5636557 | 1028616 | 0.000 | 8:,4                           |
| 5637419 | 1028616 | 0.011 | 5:,1;8:0.011,91;10:,1          |
| 5641196 | 1028616 | 0.080 | 5:1,1;8:0.074,94;10:,1;11:,4   |
| 5642914 | 1028616 | 0.000 | 8:,3                           |
| 5643778 | 1028616 | 0.056 | 8:0.057,70;10:,1               |
+---------+---------+-------+--------------------------------+
10 rows in set (4.55 sec)
mysql> select * from test limit 10;
+---------+---------+-------+-------------------------+
| protid1 | protid2 | CS    | sources                 |
+---------+---------+-------+-------------------------+
| 5635338 | 1028608 | 0.000 | 10,1                    |
| 5644385 | 1028611 | 0.947 | 51,180.943,35101,1111,1 |
| 5652325 | 1028611 | 0.947 | 51,180.943,35101,1111,1 |
| 5641128 | 1028612 | 1.000 | 81,10                   |
| 5636414 | 1028616 | 0.038 | 80.038,10410,1          |
| 5636557 | 1028616 | 0.000 | 8,4                     |
| 5637419 | 1028616 | 0.011 | 5,180.011,9110,1        |
| 5641196 | 1028616 | 0.080 | 51,180.074,9410,111,4   |
| 5642914 | 1028616 | 0.000 | 8,3                     |
| 5643778 | 1028616 | 0.056 | 80.057,7010,1           |
+---------+---------+-------+-------------------------+
10 rows in set (0.00 sec)
mysql> describe test;
+---------+------------------+------+-----+---------+-------+
| Field   | Type             | Null | Key | Default | Extra |
+---------+------------------+------+-----+---------+-------+
| protid1 | int(10) unsigned | YES  | PRI | NULL    |       |
| protid2 | int(10) unsigned | YES  | PRI | NULL    |       |
| CS      | float(4,3)       | YES  |     | NULL    |       |
| sources | varchar(100)     | YES  |     | NULL    |       |
+---------+------------------+------+-----+---------+-------+
4 rows in set (0.00 sec)
mysql> describe homologs_9606;
+---------+------------------+------+-----+---------+-------+
| Field   | Type             | Null | Key | Default | Extra |
+---------+------------------+------+-----+---------+-------+
| protid1 | int(10) unsigned | NO   | PRI | 0       |       |
| protid2 | int(10) unsigned | NO   | PRI | 0       |       |
| CS      | float(4,3)       | YES  |     | NULL    |       |
| sources | varchar(100)     | YES  |     | NULL    |       |
+---------+------------------+------+-----+---------+-------+
4 rows in set (0.00 sec)
EDIT1: added the average column length.
mysql> select AVG(LENGTH(sources)) from test;
+----------------------+
| AVG(LENGTH(sources)) |
+----------------------+
|               5.2177 |
+----------------------+
1 row in set (10.04 sec)
mysql> select AVG(LENGTH(sources)) from homologs_9606;
+----------------------+
| AVG(LENGTH(sources)) |
+----------------------+
|               6.8792 |
+----------------------+
1 row in set (9.95 sec)
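As a rough sanity check on the averages above, one could estimate how many bytes the shorter strings should save and compare that with the reported data sizes. The sketch below hard-codes the row count, the two averages, and the two 'data GB' figures from the output above. The column aliases are made up for illustration, and the estimate deliberately ignores any per-row overhead or padding added by the storage format.

```sql
-- Back-of-the-envelope estimate; all constants are copied from the output above.
SELECT
  -- rows * (old average length - new average length), converted to GB (2^30 bytes)
  ROUND(41917131 * (6.8792 - 5.2177) / POWER(2, 30), 3) AS expected_saving_gb,
  -- difference between the two reported DATA_LENGTH figures
  ROUND(0.887 - 0.857, 3) AS observed_saving_gb;
```

With these numbers the expected saving comes out at roughly 0.065 GB against an observed 0.030 GB, so only part of the removed characters shows up as a smaller data file.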
EDIT2: I was able to shave off a few more MB by setting NOT NULL on all columns.
mysql> drop table test;
Query OK, 0 rows affected (0.42 sec)
mysql> CREATE table test (protid1 INT UNSIGNED NOT NULL DEFAULT '0', protid2 INT UNSIGNED NOT NULL DEFAULT '0', CS FLOAT(4,3) NOT NULL DEFAULT '0', sources VARCHAR(100) NOT NULL DEFAULT '0', PRIMARY KEY (protid1, protid2), KEY `idx_protid2` (protid2)) ENGINE=MyISAM CHARSET=ascii;
Query OK, 0 rows affected (0.06 sec)
mysql> INSERT INTO test SELECT protid1, protid2, CS, REPLACE(REPLACE(sources, ':', ''), ';', '') FROM homologs_9606;
Query OK, 41917131 rows affected (2 min 7.84 sec)
Records: 41917131 Duplicates: 0 Warnings: 0
mysql> select TABLE_NAME name, ROUND(TABLE_ROWS/1e6, 3) 'million rows', ROUND(DATA_LENGTH/power(2,30), 3) 'data GB', ROUND(INDEX_LENGTH/power(2,30), 3) 'index GB' from information_schema.TABLES WHERE TABLE_NAME IN ('homologs_9606', 'test');
+---------------+--------------+---------+----------+
| name          | million rows | data GB | index GB |
+---------------+--------------+---------+----------+
| homologs_9606 |       41.917 |   0.887 |    1.075 |
| test          |       41.917 |   0.842 |    1.075 |
+---------------+--------------+---------+----------+
2 rows in set (0.02 sec)
Have you run 'OPTIMIZE TABLE'? – Jaco
Yes, the size is the same. – Leszek
Can you include AVG(LENGTH(sources)) for both tables? And also the character set for each table ('SHOW CREATE TABLE tablename' is better than 'DESCRIBE tablename') –
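The details asked for in the comments could be collected with something like the following sketch. It assumes the two table names from the question and uses only standard columns of information_schema.TABLES. The row format, average row length, and free space are included because they may help explain how the data file is laid out.

```sql
-- Full definition, including engine and character set, for each table
SHOW CREATE TABLE homologs_9606;
SHOW CREATE TABLE test;

-- Storage details side by side for both tables
SELECT TABLE_NAME,
       ENGINE,
       ROW_FORMAT,        -- e.g. Fixed or Dynamic for MyISAM
       TABLE_COLLATION,   -- implies the table's default character set
       AVG_ROW_LENGTH,    -- average bytes per row as stored
       DATA_LENGTH,
       DATA_FREE          -- allocated but unused bytes in the data file
FROM information_schema.TABLES
WHERE TABLE_NAME IN ('homologs_9606', 'test');
```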