
HDFS How to List Files

You can list files in HDFS with either of the following commands:

hdfs dfs -ls hdfs://my-host1:500
hadoop fs -ls hdfs://my-host1:500

This is one of the most common Hadoop/HDFS commands. It works much like the Unix/Linux “ls” command. It is not identical, but it should be easy to use if you are already familiar with “ls”.

Here is the command usage:

Usage: hadoop fs -ls [-d] [-h] [-R] [-t] [-S] [-r] [-u] <args>
Options:
-d: Directories are listed as plain files.
-h: Format file sizes in a human-readable fashion (eg 64.0m instead of 67108864).
-R: Recursively list subdirectories encountered.
-t: Sort output by modification time (most recent first).
-S: Sort output by file size.
-r: Reverse the sort order.
-u: Use access time rather than modification time for display and sorting.

To list all files recursively, sorted by timestamp, use the following:

hdfs dfs -ls -R -t hdfs://my-host1:500
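
A recursive listing can get long, so it may help to summarize it. The following is a minimal sketch, not part of Hadoop itself: count_entries is a hypothetical helper name, and it assumes the first column of each listing line is the permission string, starting with “d” for directories and “-” for plain files.

```shell
# Count files and directories in `hdfs dfs -ls` style output read from stdin.
# Assumption: column 1 is the permission string ("d..." = directory, "-..." = file).
# Lines such as "Found 2 items" match neither pattern and are ignored.
count_entries() {
  awk '
    $1 ~ /^d/ { dirs++ }
    $1 ~ /^-/ { files++ }
    END { printf "files=%d dirs=%d\n", files, dirs }
  '
}

# Hypothetical usage against a cluster (same host and port as the examples above):
# hdfs dfs -ls -R hdfs://my-host1:500 | count_entries
```

The helper only reads standard input, so it can be tried on a saved listing before running it against a live cluster.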

You can sort the output by file size like this:

hdfs dfs -ls -S hdfs://my-host1:500

This is useful if you are searching for files that are taking up a large amount of space.
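
If you only care about files above a certain size, the listing can be filtered further. This is a rough sketch under stated assumptions: large_files is a hypothetical helper name, sizes are in plain bytes (so it should not be combined with “-h”), and the listing columns are assumed to be permissions, replicas, owner, group, size, date, time, and path. Paths containing spaces would be cut off at the first space.

```shell
# Print "size path" for plain files larger than a byte threshold.
# Reads `hdfs dfs -ls` style output from stdin.
# Assumption: column 5 is the size in bytes and column 8 is the path.
large_files() {
  min_bytes="$1"
  awk -v min="$min_bytes" '
    # Skip header lines (fewer than 8 fields) and directories (leading "d").
    NF >= 8 && $1 !~ /^d/ && ($5 + 0) > (min + 0) { print $5, $8 }
  '
}

# Hypothetical usage: files over 1 GiB, largest first (same host and port as above):
# hdfs dfs -ls -R -S hdfs://my-host1:500 | large_files 1073741824
```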

The “-h” option is really handy. It produces human-readable sizes: file sizes are displayed in M or G rather than as a total number of bytes.

hadoop fs -ls -h hdfs://my-host1:500