This repository was archived by the owner on Jul 16, 2020. It is now read-only.

"Frame length should be positive" problem in XGBoost with CPU (Mortgage-large) #71

Description

@peizhaoliu

Dear author,

I have been following this guide: "https://github.com/rapidsai/spark-examples/blob/master/getting-started-guides/on-prem-cluster/standalone-scala.md".
I launched distributed training without GPUs (tree method hist), using the following parameters: "--num-executors 1 --executor-cores 19 --conf spark.cores.max=19 --conf spark.task.cpus=1 --class ai.rapids.spark.examples.mortgage.CPUMain -numWorkers=19 -treeMethod=hist"
However, the tasks of the stage "foreachPartition at XGBoost.scala:703" always stay stuck in the "running" state. A few hours after submitting the job, we received the following error:
java.lang.IllegalArgumentException: Frame length should be positive: -9223371863126827765
    at org.spark_project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
    at org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:134)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:81)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:748)

Could you please offer some tips on this issue? Thanks.

Sincerely
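
For anyone hitting the same trace, it may help to look at the raw bytes behind the reported number. The sketch below is only an illustration and is written in Python for convenience. It assumes that the decoder named in the trace (`TransportFrameDecoder`) reads an 8-byte big-endian signed length prefix and subtracts those 8 bytes before checking that the result is positive; that is my understanding of the Spark source, not something this report confirms. The helper names `raw_prefix_bytes` and `frame_size_from_prefix` are hypothetical.

```python
import struct

# Assumption: the length prefix is 8 bytes wide and is included in the
# length it encodes, so the decoder subtracts it before validating.
LENGTH_SIZE = 8


def raw_prefix_bytes(reported_frame_size: int) -> bytes:
    """Recover the 8 bytes that would have been read as the length prefix."""
    raw_length = reported_frame_size + LENGTH_SIZE  # undo the assumed subtraction
    return struct.pack(">q", raw_length)  # signed 64-bit, big-endian


def frame_size_from_prefix(prefix: bytes) -> int:
    """Mimic the assumed decoding step: read a signed long, subtract the prefix size."""
    (raw_length,) = struct.unpack(">q", prefix)
    return raw_length - LENGTH_SIZE


if __name__ == "__main__":
    reported = -9223371863126827765  # value from the exception message above
    prefix = raw_prefix_bytes(reported)
    # Print the bytes in hex so they can be compared against a packet capture.
    print(" ".join(f"{b:02x}" for b in prefix))
```

If those bytes do not look like a plausible frame length, one possibility is that the data arriving on that connection was not a well-formed Spark frame at all (for example, a different service or protocol writing to the same port, or a corrupted stream). This is a diagnostic aid under the stated assumptions, not a confirmed root cause for this job.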
