
Fixing the "could not be cleaned up" error when running HiveSQL

While executing a HiveSQL statement, the following error appeared:

2018-11-05 16:43:02,568 ERROR org.apache.hadoop.hive.ql.exec.Task: [HiveServer2-Background-Pool: Thread-30333]: Failed with exception Directory hdfs://master:8020/user/hive/bus_optimation_xm/g_operate_statistic/date_time=201806124/date_type=1440/date_flag=1/index_id=100201001 could not be cleaned up.
org.apache.hadoop.hive.ql.metadata.HiveException: Directory hdfs://master:8020/user/hive/bus_optimation_xm/g_operate_statistic/date_time=201806124/date_type=1440/date_flag=1/index_id=100201001 could not be cleaned up.
        at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2936)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1442)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:1636)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:388)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1782)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1539)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1318)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1127)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:178)
        at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:72)
        at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:232)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:245)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)                                                                                                         
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)                                                                                                 
        at java.lang.Thread.run(Thread.java:745)

Along with the following error:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied by sticky bit setting: user=bus, inode=000000_0
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkStickyBit(DefaultAuthorizationProvider.java:388)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:166)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6621)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4078)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4030)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4014)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:841)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:308)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:597)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

        at org.apache.hadoop.ipc.Client.call(Client.java:1471)
        at org.apache.hadoop.ipc.Client.call(Client.java:1408)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
        at com.sun.proxy.$Proxy14.delete(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:531)
        at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
        at com.sun.proxy.$Proxy15.delete(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2038)
        ... 30 more

"Could not be cleaned up" clearly pointed to a permissions problem, but even after bluntly granting 777 permissions I still could not delete the partition directory. Digging further through the logs, I found this line:

Permission denied by sticky bit setting: user=bus, inode=000000_0

This was the only permission-related hint in the entire error log, so I turned to Google and learned that the error was caused by the sticky bit. So what is the sticky bit?

sticky bit
Unlike suid and sgid, the sticky bit is set on the "others" execute permission slot and is displayed as t. If the execute bit (x) is also present in that slot, the combination shows as a lowercase t; if the execute bit is absent, the sticky bit shows as an uppercase T.

The sticky bit only takes effect on directories. If a directory has the sticky bit set, a file in it can only be deleted by the file's owner or by root; other users cannot delete it even if they have write permission on the directory.

For example, the /tmp directory has mode drwxrwxrwt, so the files (or directories) inside it can only be deleted by their owner or root. Everyone can drop their temporary files into the directory, but nobody else can delete yours.

Note: suid takes effect only on files (sgid, strictly speaking, also affects directories, where it controls group inheritance rather than deletion), while the sticky bit takes effect only on directories.

Here is what files look like once the sticky bit is configured:

-rwxrwxrwt   3 hdfs hive        636 2018-11-06 10:19 /user/hive/xxxx/000000_0
-rwxrwxrwt   3 hdfs hive        635 2018-11-06 10:19 /user/hive/xxxx/000001_0
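The same mode string can be reproduced on any local Linux filesystem, which makes the semantics easy to inspect before touching HDFS (a minimal sketch; the scratch directory name is arbitrary):

```shell
# Create a scratch directory and set the sticky bit (the leading 1 in 1777)
mkdir -p /tmp/sticky_demo
chmod 1777 /tmp/sticky_demo

# The trailing 't' in the mode string shows the sticky bit is set
ls -ld /tmp/sticky_demo

# stat prints the octal mode, including the sticky bit digit
stat -c '%a' /tmp/sticky_demo
```

With the bit set, only a file's owner (or root) can delete files placed in this directory, regardless of the rwx bits.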

That explained it. Our Hive is accessed sometimes through the console and sometimes through HUE, and a quick count turned up roughly five users operating on Hive tables. So once the sticky bit is set, a file created by, say, the hadoop user can no longer be deleted by any of the other users. According to the official CDH documentation, the sticky bit is set like this:

hadoop fs -chmod -R 1777 /user/hive/xxxx/

So the fix is to remove the sticky bit:

hadoop fs -chmod -R -t /user/hive/xxxx/

In my testing, the only way to clear the sticky bit was the -t symbolic argument; anyone familiar with the Linux chmod command will recognize what it means. chmod 777 does not clear this bit.
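The removal step can be verified locally as well (a sketch assuming a GNU/Linux system; `a-t` is the explicit spelling of the same symbolic mode, which avoids `-t` being mistaken for a command-line option by some chmod implementations):

```shell
# Set up a scratch directory with the sticky bit on
mkdir -p /tmp/sticky_demo2
chmod 1777 /tmp/sticky_demo2
stat -c '%a' /tmp/sticky_demo2   # sticky bit set: 1777

# The symbolic mode clears only the sticky bit, leaving rwx untouched
chmod a-t /tmp/sticky_demo2
stat -c '%a' /tmp/sticky_demo2   # sticky bit cleared: 777
```

Once the equivalent `hadoop fs -chmod -R -t` has run against the Hive warehouse path, any user with write permission on the directory can delete the files again, and the HiveSQL overwrite succeeds.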

Reference: https://www.cloudera.com/documentation/enterprise/5-13-x/topics/cdh_sg_sticky_bit_set.html

Reproduction without permission is prohibited: Charles's Blog » Fixing the "could not be cleaned up" error when running HiveSQL
