Apache Spark - MLlib - Collaborative filtering

I am trying to use MLlib for collaborative filtering.

I get the following error in my Scala program when I run it on Apache Spark 1.0.0.

    14/07/15 16:16:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    14/07/15 16:16:31 WARN LoadSnappy: Snappy native library not loaded 
    14/07/15 16:16:31 INFO FileInputFormat: Total input paths to process : 1 
    14/07/15 16:16:38 WARN TaskSetManager: Lost TID 10 (task 80.0:0) 
    14/07/15 16:16:38 WARN TaskSetManager: Loss was due to java.lang.UnsatisfiedLinkError 
    java.lang.UnsatisfiedLinkError: org.jblas.NativeBlas.dposv(CII[DII[DII)I 
     at org.jblas.NativeBlas.dposv(Native Method) 
     at org.jblas.SimpleBlas.posv(SimpleBlas.java:369) 
     at org.jblas.Solve.solvePositive(Solve.java:68) 
     at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateBlock$2.apply(ALS.scala:522) 
     at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateBlock$2.apply(ALS.scala:509) 
     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) 
     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) 
     at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) 
     at scala.collection.mutable.ArrayOps$ofInt.foreach(ArrayOps.scala:156) 
     at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) 
     at scala.collection.mutable.ArrayOps$ofInt.map(ArrayOps.scala:156) 
     at org.apache.spark.mllib.recommendation.ALS.org$apache$spark$mllib$recommendation$ALS$$updateBlock(ALS.scala:509) 
     at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateFeatures$2.apply(ALS.scala:445) 
     at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateFeatures$2.apply(ALS.scala:444) 
     at org.apache.spark.rdd.MappedValuesRDD$$anonfun$compute$1.apply(MappedValuesRDD.scala:31) 
     at org.apache.spark.rdd.MappedValuesRDD$$anonfun$compute$1.apply(MappedValuesRDD.scala:31) 
     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) 
     at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:156) 
     at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:154) 
     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) 
     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) 
     at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:154) 
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) 
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) 
     at org.apache.spark.rdd.MappedValuesRDD.compute(MappedValuesRDD.scala:31) 
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) 
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) 
     at org.apache.spark.rdd.FlatMappedValuesRDD.compute(FlatMappedValuesRDD.scala:31) 
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) 
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) 
     at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33) 
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) 
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) 
     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158) 
     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99) 
     at org.apache.spark.scheduler.Task.run(Task.scala:51) 
     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187) 
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
     at java.lang.Thread.run(Thread.java:744) 
    14/07/15 16:16:38 ERROR TaskSchedulerImpl: Lost executor 0 on maroki.office.mkechinov.ru: Uncaught exception 
    14/07/15 16:16:38 WARN TaskSetManager: Lost TID 12 (task 80.0:0) 
    14/07/15 16:16:42 WARN TaskSetManager: Lost TID 18 (task 80.0:1) 
    14/07/15 16:16:42 WARN TaskSetManager: Loss was due to fetch failure from null 
    14/07/15 16:16:42 WARN TaskSetManager: Loss was due to fetch failure from null 
    14/07/15 16:16:43 WARN TaskSetManager: Lost TID 25 (task 80.1:0) 
    14/07/15 16:16:43 WARN TaskSetManager: Loss was due to java.lang.UnsatisfiedLinkError 

How can I fix this error?

Answer

The Spark documentation clearly states that MLlib uses native libraries, which must be present on the nodes (that is, they do not ship with the Spark installation):

MLlib uses the jblas linear algebra library, which in turn depends on native Fortran routines. You may need to install the gfortran runtime library if it is not already present on your nodes. MLlib will throw a linking error if it cannot detect these libraries automatically.

So you must make sure that the libgfortran library exists on all nodes, since the UnsatisfiedLinkError in org.jblas.NativeBlas is raised on the executors.

On Debian/Ubuntu, use: sudo apt-get install libgfortran3

On CentOS, use: sudo yum install gcc-gfortran
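
After installing, it may help to confirm on each node that the dynamic linker can actually see the library. The following is only a minimal sketch, assuming a Linux node where `ldconfig -p` lists the linker cache; the function name `lists_libgfortran` is made up for illustration and is not part of Spark or jblas.

```shell
#!/bin/sh
# Hypothetical helper: reads a linker-cache listing on stdin and
# succeeds if any entry names a libgfortran shared object.
lists_libgfortran() {
    grep -q 'libgfortran\.so'
}

# Assumption: ldconfig is available on the node (glibc-based Linux).
# If it is missing, the listing is empty and the check reports "missing".
if ldconfig -p 2>/dev/null | lists_libgfortran; then
    echo "libgfortran: found"
else
    echo "libgfortran: missing"
fi
```

Run it on every worker node, not just the driver. A node that reports "missing" is a likely place for the linking error to reappear.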
