Nell'odierno episodio di hbase mi sta riportando alla mia intelligenza abbiamo un problema in cui il master di hbase inizia e poi muore molto rapidamente. Il mio registro master è in questo modo:Hbase master continua a morire, afferma un hbase: lo spazio dei nomi esiste già
2014-06-20 12:52:40,469 FATAL [master:hdev01:60000] master.HMaster: Master serve
r abort: loaded coprocessors are: []
2014-06-20 12:52:40,470 FATAL [master:hdev01:60000] master.HMaster: Unhandled ex
ception. Starting shutdown.
org.apache.hadoop.hbase.TableExistsException: hbase:namespace
at org.apache.hadoop.hbase.master.handler.CreateTableHandler.prepare(Cre
ateTableHandler.java:120)
at org.apache.hadoop.hbase.master.TableNamespaceManager.createNamespaceT
able(TableNamespaceManager.java:232)
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNames
paceManager.java:86)
at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:106
2)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.j
ava:926)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:615)
at java.lang.Thread.run(Thread.java:662)
2014-06-20 12:52:40,473 INFO [master:hdev01:60000] master.HMaster: Aborting
2014-06-20 12:52:40,473 DEBUG [master:hdev01:60000] master.HMaster: Stopping ser
vice threads
2014-06-20 12:52:40,473 INFO [master:hdev01:60000] ipc.RpcServer: Stopping serv
er on 60000
2014-06-20 12:52:40,473 INFO [CatalogJanitor-hdev01:60000] master.CatalogJanito
r: CatalogJanitor-hdev01:60000 exiting
2014-06-20 12:52:40,473 INFO [hdev01,60000,1403283149823-BalancerChore] balance
r.BalancerChore: hdev01,60000,1403283149823-BalancerChore exiting
2014-06-20 12:52:40,474 INFO [RpcServer.listener,port=60000] ipc.RpcServer: Rpc
Server.listener,port=60000: stopping
2014-06-20 12:52:40,474 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.res
ponder: stopped
2014-06-20 12:52:40,474 INFO [master:hdev01:60000] master.HMaster: Stopping inf
oServer
2014-06-20 12:52:40,474 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.res
ponder: stopping
2014-06-20 12:52:40,474 INFO [master:hdev01:60000.oldLogCleaner] cleaner.LogCle
aner: master:hdev01:60000.oldLogCleaner exiting
2014-06-20 12:52:40,475 INFO [hdev01,60000,1403283149823-ClusterStatusChore] ba
lancer.ClusterStatusChore: hdev01,60000,1403283149823-ClusterStatusChore exiting
2014-06-20 12:52:40,476 INFO [master:hdev01:60000.oldLogCleaner] master.Replica
tionLogCleaner: Stopping replicationLogCleaner-0x246ba2ab1e4001c, quorum=hdev02:
5181,hdev01:5181,hdev03:5181, baseZNode=/hbase
2014-06-20 12:52:40,479 INFO [master:hdev01:60000] mortbay.log: Stopped SelectC
[email protected]:16010
2014-06-20 12:52:40,478 INFO [master:hdev01:60000.archivedHFileCleaner] cleaner
.HFileCleaner: master:hdev01:60000.archivedHFileCleaner exiting
2014-06-20 12:52:40,483 INFO [master:hdev01:60000.oldLogCleaner] zookeeper.ZooK
eeper: Session: 0x246ba2ab1e4001c closed
2014-06-20 12:52:40,484 INFO [master:hdev01:60000-EventThread] zookeeper.Client
Cnxn: EventThread shut down
2014-06-20 12:52:40,589 DEBUG [master:hdev01:60000] catalog.CatalogTracker: Stop
ping catalog tracker [email protected]
2014-06-20 12:52:40,591 INFO [master:hdev01:60000] client.HConnectionManager$HC
onnectionImplementation: Closing zookeeper sessionid=0x246ba2ab1e4001b
2014-06-20 12:52:40,592 INFO [master:hdev01:60000] zookeeper.ZooKeeper: Session
: 0x246ba2ab1e4001b closed
2014-06-20 12:52:40,592 INFO [master:hdev01:60000-EventThread] zookeeper.Client
Cnxn: EventThread shut down
2014-06-20 12:52:40,695 INFO [hdev01,60000,1403283149823.splitLogManagerTimeout
Monitor] master.SplitLogManager$TimeoutMonitor: hdev01,60000,1403283149823.split
LogManagerTimeoutMonitor exiting
2014-06-20 12:52:40,696 INFO [master:hdev01:60000] zookeeper.ZooKeeper: Session
: 0x246ba2ab1e4001a closed
2014-06-20 12:52:40,696 INFO [main-EventThread] zookeeper.ClientCnxn: EventThre
ad shut down
2014-06-20 12:52:40,696 INFO [master:hdev01:60000] master.HMaster: HMaster main
thread exiting
2014-06-20 12:52:40,697 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMaster
CommandLine.java:194)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandL
ine.java:135)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLi
ne.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2803)
ho pensato che questo potrebbe essere qualche residuo di una vecchia corsa così ho cancellato i file nella directory hbases dei dati, la directory zookeepers dati e le mie HDFS. Ho ancora lo stesso errore. Stranamente il mio popper HMaster torna temporaneamente indietro quando ho eseguito stop-hbase.sh anche se non c'era molto che potessi fare.
La mia versione di Hbase è 98,3 e il mio hadoop è 2.2.0. Il mio HBase-site.comf è
<configuration>
<property>
<name>hbase.master</name>
<value>hdev01:60000</value>
<description>The host and port that the HBase master runs at.
A value of 'local' runs the master and a regionserver
in a single process.
</description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hdev01:9000/hbase</value>
<description>The directory shared by region servers.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed
Zookeeper true: fully-distributed with unmanaged Zookeeper
Quorum (see hbase-env.sh)
</description>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>5181</value>
<description>Property from ZooKeeper's config zoo.cfg.
The port at which the clients will connect.
</description>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>10000</value>
<description></description>
</property>
<property>
<name>hbase.client.retries.number</name>
<value>10</value>
<description></description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hdev01,hdev02,hdev03</value>
<description>Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If
HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which we will start/stop
ZooKeeper on.
</description>
</property>
</configuration>
EDIT HBase Tentativo org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair, il mio errore è ora HBase file layout needs to be upgraded. You have version null and I want version 8. Is your hbase.rootdir valid? If so, you may need to run 'hbase hbck -fixVersionFile'
Che è inutile dal momento che senza padrone hbck non verrà effettivamente eseguito . Modifica modificata Ho imbucato e riavviato il mio dfs e poi ho provato a riparare e riavviare le cose, ora sono tornato dove ho iniziato.
Questo ha cambiato il mio errore almeno. Ora mi dice ATTENZIONE! Il layout del file HBase deve essere aggiornato. Hai versione null e voglio la versione 8. Il tuo hbase.rootdir è valido? In tal caso, potrebbe essere necessario eseguire "hbase hbck -fixVersionFile". che è inutile dal momento che mi manca un master per eseguire hbck su – chenab
E dopo bombardare i miei DFS e quindi eseguire nuovamente lo strumento in linea di riparazione sono tornato dove ho iniziato – chenab
Ebbene si può anche accedere per Zookeeper e rimuovere il dir/HBase (HBase zkcli e rmr la directory/hbase - fai attenzione a non cancellare altro) –