How to Remove a Directory in HDFS
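Removing a directory in HDFS is easy. You can do this from the command line with any of the following commands. Substitute your own NameNode host and port in the URI; note that 50070 is usually the NameNode web UI port in Hadoop 2.x, so the RPC port your cluster uses (often 8020) is probably the one that belongs here.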
hdfs dfs -rm -r hdfs://my-host1:50070/my-dir1
hadoop dfs -rm -r hdfs://my-host1:50070/my-dir1
hdfs dfs -rmdir hdfs://my-host1:50070/my-dir1
The key here is the “-r” flag. It isn’t needed when you delete regular files, but to delete a directory with the “-rm” command you must specify that the deletion is recursive. This removes everything under the directory, including subdirectories. The “-rmdir” command is another option, but it only removes empty directories. The “hadoop dfs” form still works, but it is deprecated in favor of “hdfs dfs”.
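As a rough illustration of choosing between the two commands, here is a minimal sketch of a wrapper function. The name remove_hdfs_dir and its second argument are made up for this example; only the “-test -d”, “-rm -r” and “-rmdir” options come from the HDFS shell itself. It assumes the hdfs client is on your PATH.

```shell
# Hypothetical helper (not part of Hadoop): remove an HDFS directory.
# Usage: remove_hdfs_dir <path> [recursive]
remove_hdfs_dir() {
  # "-test -d" exits 0 only when the path exists and is a directory.
  if ! hdfs dfs -test -d "$1"; then
    echo "not a directory: $1" >&2
    return 1
  fi

  if [ "${2:-}" = "recursive" ]; then
    # Deletes the directory and everything beneath it.
    hdfs dfs -rm -r "$1"
  else
    # Succeeds only if the directory is empty.
    hdfs dfs -rmdir "$1"
  fi
}
```

Calling remove_hdfs_dir /my-dir1 tries the safer empty-only removal, while remove_hdfs_dir /my-dir1 recursive deletes the whole tree.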
When trash is enabled, deleted items are moved to the trash directory first. You can bypass it with the “-skipTrash” option, in which case anything you delete is gone permanently. It works like this:
hdfs dfs -rm -r -skipTrash hdfs://my-host1:50070/my-dir1
For extra safety, the “-safely” option asks for confirmation before deleting a directory that contains more files than the limit set by hadoop.shell.delete.limit.num.files (in core-site.xml, 100 by default).
hdfs dfs -rm -r -safely hdfs://my-host1:50070/my-dir1
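If you want a similar file-count check inside a script, where an interactive prompt is not practical, one possible approach is to count the files yourself before deleting. The sketch below is only an illustration: the function name guarded_remove and its limit argument are invented here, and it assumes “hdfs dfs -count” prints the directory count, file count, content size and path name in that order.

```shell
# Hypothetical guard (not part of Hadoop): refuse to delete a directory
# that holds more files than a given limit.
# Usage: guarded_remove <path> [max_files]
guarded_remove() {
  limit="${2:-100}"   # 100 is just this sketch's default

  # "hdfs dfs -count" prints: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
  files=$(hdfs dfs -count "$1" | awk '{print $2}')

  # Bail out if the count is missing or not a number.
  case "$files" in
    ''|*[!0-9]*)
      echo "could not count files under $1" >&2
      return 1
      ;;
  esac

  if [ "$files" -gt "$limit" ]; then
    echo "refusing: $1 contains $files files (limit $limit)" >&2
    return 1
  fi

  hdfs dfs -rm -r "$1"
}
```

Unlike the built-in option, this version never prompts. It simply stops when the limit is exceeded, which tends to suit unattended jobs better.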
Removing A File From HDFS With Special Chars Like a Space or Comma
If you have special characters in your path, you can just escape them with a backslash like this:
hdfs dfs -rm -r hdfs://my-host1:50070/my-dir1/another\,\ subdir
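Quoting the whole path is an alternative to escaping each character, and it is often easier to read. The short sketch below only compares the two spellings in the local shell and does not contact HDFS; the variable names are made up for the example.

```shell
# Two spellings of the same path: backslash-escaped and single-quoted.
escaped_path=hdfs://my-host1:50070/my-dir1/another\,\ subdir
quoted_path='hdfs://my-host1:50070/my-dir1/another, subdir'

# The shell turns both into one string with identical contents.
if [ "$escaped_path" = "$quoted_path" ]; then
  echo "same path"
fi

# Either form can then be passed as a single argument (not run here):
# hdfs dfs -rm -r "$quoted_path"
```

Keeping the variable in double quotes when you pass it on matters, because an unquoted $quoted_path would be split at the space into two arguments.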
The “-rmr” dfs command is deprecated in favor of “-rm -r”. You may still find references to it in outdated guides and on forums. Apparently this has been the case since Hadoop 2.6. The old form looked like this:
hadoop dfs -rmr hdfs://my-host1:50070/my-dir1