- A web server.
- PHP 4+ CLI (command-line interface).
Note: The web server does not need to be configured to execute PHP scripts.
Quick Start
See https://github.com/amekkawi/diskusagereports#quick-start for a quick start guide.
- Download the latest version of Disk Usage Reports.
- Unzip the files into your Web server's public directory.
- If your web server executes PHP scripts, you must either secure the 'scripts' directory so it is not publicly accessible, or move the 'scripts' directory to a location on your server that is not publicly accessible.
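The second option above, moving the 'scripts' directory, could look roughly like the following. This is only a sketch: the function name relocate_scripts and the paths in the usage comment are hypothetical and are not part of Disk Usage Reports.

```shell
# Hypothetical helper: move the 'scripts' directory out of the web root.
# Usage: relocate_scripts <install-dir> <private-dir>
relocate_scripts() {
    local install_dir="$1"
    local private_dir="$2"

    if [ ! -d "$install_dir/scripts" ]; then
        echo "No scripts directory in $install_dir" >&2
        return 1
    fi

    mkdir -p "$private_dir" || return 1

    # Refuse to overwrite a copy that was moved earlier.
    if [ -e "$private_dir/scripts" ]; then
        echo "$private_dir/scripts already exists" >&2
        return 1
    fi

    mv "$install_dir/scripts" "$private_dir/" || return 1

    # Restrict access to the owner, who must be the user that runs the reports.
    chmod 700 "$private_dir/scripts"
}

# Example (both paths are assumptions; adjust them to your installation):
# relocate_scripts /var/www/html/reports /opt/diskusage
```

After the move, the scripts would be run from their new location, for example bash /opt/diskusage/scripts/find.sh.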
Overview of Generating Reports
There are two steps to generating the reports:
- Create a list of directories and files that are in the directory on which you wish to report.
- Process the list to generate the report files.
Step 1: Creating the List of Directories and Files
For Linux, Mac OS X, and BSD systems, the fastest way is to use the bash script
scripts/find.sh, which uses the GNU find command. It has the following syntax:
Syntax: find.sh [-b|-ne] [-d <char|'null'>] [-] <directory-to-scan> [<find-test>, ...]

Arguments:

  -b
      Force the usage of the 'ls' command's -b argument to escape unusual characters (e.g. a newline) in file names. Use this flag if you know that 'ls' supports this argument on your system and you want to skip the use of 'mktemp' to check for support.

  -d <char|'null'>
      Optionally specify the field delimiter for each line in the output. Must be a single ASCII character or the word 'null' for the null character. The default is the space character.

  -ne
      Force the script to execute even if the 'ls' command does not support the --escape or -b arguments. This will cause problems if file names encountered during the scan contain newlines.

  - (minus sign)
      If the <directory-to-scan> is the same as one of the options for this script (e.g. '-d'), you must use a minus sign as an argument before it. You should do this if you ever expect the <directory-to-scan> to start with a minus sign.

  <directory-to-scan>
      The directory that the list of sub-directories and files will be created for.

  <find-test>
      Optionally specify one or more tests that will be passed directly to the 'find' command. You must use the absolute path for any tests that match the path, such as '-path'. Do not use any expressions that would change the output of find, such as '-ls'. If using '-type', make sure that you do not exclude directories. See the 'find' man page for details.

Expression Examples:

  ! -name '.DS_Store' -a ! -name 'Thumbs.db'
      Exclude extra files created by Windows and Mac OS.

  ! -size 0c
      Exclude files that have a size of zero bytes.

  ! -path '/var/www/html/somesite/*'
      Exclude the contents of a directory from the results.

  ! -path '/var/www/html/somesite' -a ! -path '/var/www/html/somesite/*'
      Completely exclude a directory from the results.

  \( -type d -o -type f \)
      Only include directories and regular files.
For Windows systems, the fastest way is to use the
scripts/find.exe command. It has the following syntax:
Syntax: find.exe [OPTIONS] <directory-to-scan>

  <directory-to-scan>
      The directory that the list of sub-directories and files will be created for.

The OPTIONS are:

  -d <delim>
      The field delimiter for each line in the output. The default is the NULL character.

  -ds <directoryseparator>
      The directory separator used between directory names. The default is the directory separator for the operating system.
For other systems, you may use scripts/find.php. It has the following syntax:
Syntax: php find.php [-d <char|'null'>] [-ds <char>] [--force32bit] [-] <directory-to-scan>

Arguments:

  -d <char|'null'>
      Optionally specify the field delimiter for each line in the output. Must be a single ASCII character or the word 'null' for the null character. The default is the space character.

  -ds <directoryseparator>
      Optionally specify the directory separator used between directory names. The default is the directory separator for the operating system.

  --force32bit
      Force the script to execute on 32-bit versions of PHP. This may lead to incorrect totals if find.php encounters files over 2 GB.

  - (hyphen)
      If the <directory-to-scan> is the same as one of the arguments for this script (e.g. '-d'), you must use a minus sign as an argument before it. You should do this if you ever expect the <directory-to-scan> to start with a minus sign.

  <directory-to-scan>
      The directory that the list of sub-directories and files will be created for.
All the above scripts will output to STDOUT.
Here are some examples of their usage:
bash scripts/find.sh path/to/directory > list-of-files.dat
scripts\find.exe path\to\directory > list-of-files.dat
php scripts/find.php path/to/directory > list-of-files.dat
Here are some examples of the OPTIONS:
bash scripts/find.sh -d " " path/to/directory > list-of-files.dat
Use a space as the field delimiter in the output. This is useful if you want to visually inspect the output, since the default NULL delimiter does not display on the console.
scripts\find.exe -ds / path\to\directory > list-of-files.dat
If the files are on a Windows server but you will process the report on a Linux computer, you must force the script to use a forward slash as the directory separator using -ds /.
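A list that was already written with the NULL delimiter can still be inspected by translating the NULL bytes into a visible character. The following is a minimal sketch using the standard tr command; the function name show_fields is hypothetical, and the '|' character is an arbitrary choice.

```shell
# Hypothetical helper: print a NULL-delimited list with '|' between fields.
# Reads the file named by the first argument, or STDIN when no argument is given.
show_fields() {
    if [ $# -gt 0 ]; then
        tr '\000' '|' < "$1"
    else
        tr '\000' '|'
    fi
}

# Example: show the first lines of a list created by one of the 'find' scripts.
# show_fields list-of-files.dat | head
```

This only changes what is displayed. The list file itself is left untouched and can still be passed to the processing step.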
Step 2: Processing the List and Generating the Report
Processing the output of the "find" scripts is done by the PHP script
scripts/process.php. It has the following syntax:
Syntax: php process.php [OPTIONS] <report-directory> [<filelist>]

  <report-directory>
      The directory where the report files will be saved. This should point to a directory under the 'data' directory. Examples:
          /var/www/html/diskusage/data/myreport
          C:\Inetpub\wwwroot\diskusage\data\myreport

  <filelist>
      The file that was created using one of the 'find' scripts (e.g. find.php). If you omit this, process.php will attempt to read the file list from STDIN.

The OPTIONS are:

  - (hyphen)
      If the <report-directory> or <filelist> are the same as one of the OPTIONS for this script (e.g. "-d"), you must use a minus sign as an argument before it. You should do this if you ever expect the <report-directory> or <filelist> to start with a minus sign.

  -d <delim>
      The field delimiter that each line of the filelist will be split using. The default is the NULL character. Will be ignored if <filelist> has a header line (see notes).

  -ds <directoryseparator>
      Specify the directory separator used in the file list. This is useful if the list from step 1 was generated on a different operating system which uses a different directory separator. For example, Windows uses a backslash (\) while Linux/BSD/Mac/etc systems use a forward slash (/). The default is the directory separator for the operating system processing the report. Will be ignored if <filelist> has a header line (see notes).

  -fp
      Display the full path of the directories in the report. This is off by default since it could potentially pose a security risk.

  -l <num>
      Lines in the report that are longer than <num> will not be processed. This is just a failsafe to prevent the script from processing a list file that is not formatted properly. The default is 1024.

  -mt <bytes>
      The maximum number of bytes that the 'directory tree' file can be. The default is 819200. If the 'directory tree' file gets larger than this number, then the script will act as if -nt had been specified.

  -n <reportname>
      This text will display in the header of the report.

  -nt
      Disable the directory tree that appears on the left side of the report.

  -q
      Do not output any text to STDOUT. The script will return a non-zero exit status if it fails.

  -ss <seconds>
      The minimum number of seconds that must elapse before another status message (e.g. 'Read X bytes, processed X lines...') is output. Default is 15 seconds.

  -su <suffix>
      Set the suffix of report files. This is '.txt' by default. You must also edit the 'suffix' variable in index.html to include any suffix besides the default or an empty suffix.

  -t <depth>
      Limit the "File Sizes", "Modified", and "File Types" totals to only <depth> directories deep in the report. This is useful if the directory being reported on has many files, which can cause the report to take a long time to generate. For example, if this is set to 3 the directories ./a, ./a/b and ./a/b/c will have these totals available, but ./a/b/c/d will not. The default is 6.

  -td <depth>
      Similar to -t but instead limits the "Top 100" list to only <depth> directories deep in the report. This is useful if the directory being reported on has many files, which can cause the report to take a long time to generate. The default is 3.

  -tz <timezone>
      Set the report timezone. These are the same timezones as http://php.net/manual/en/timezones.php. The default is the system's timezone (if it can be determined).

  -v
      Output additional information as the script executes.

  -vv
      Output more information than -v.

Notes:

  o You should set the -tz option as trying to determine the system's timezone is unreliable.

  o You may execute process.php on a separate server than the 'find' script if you are worried about it using CPU time.

  o The directory separator used in <filelist> must be a forward slash if this script is executed on a *nix system.

  o If the <filelist> has a header line (starts with a #) then the -d and -ds OPTIONS will be ignored since the header explicitly defines what their values should be.
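Because of the -l failsafe described above, it can be useful to check a list for overly long lines before processing it. The following is a rough sketch using the standard tr and awk commands. The function name count_long_lines is hypothetical, and the sketch assumes that line length is measured in bytes, which is not stated in the help text.

```shell
# Hypothetical pre-check: count the lines in a list file that exceed a length limit.
# Usage: count_long_lines <filelist> [<max-bytes>]   (the limit defaults to 1024)
count_long_lines() {
    local filelist="$1"
    local max="${2:-1024}"

    # Replace NULL bytes with spaces first, since some awk implementations
    # do not handle NULL bytes inside a line. LC_ALL=C makes length() count
    # bytes rather than multi-byte characters.
    tr '\000' ' ' < "$filelist" |
        LC_ALL=C awk -v max="$max" 'length($0) > max { n++ } END { print n + 0 }'
}

# Example: report how many lines would be skipped with the default limit.
# count_long_lines list-of-files.dat
```

A result above zero suggests either raising the limit with -l or checking that the list is formatted properly.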
Here are some examples of its usage:
php scripts/process.php path/to/report/dir list-of-files.dat
cat list-of-files.dat | php scripts/process.php /var/www/html/usage/data/myreport
php scripts\process.php c:\path\to\report\dir list-of-files.dat
Here are some examples of combining the 'find' scripts and process.php into one command:
bash scripts/find.sh path/to/directory | php scripts/process.php path/to/report/dir
scripts\find.exe c:\path\to\directory | php scripts\process.php c:\Inetpub\wwwroot\diskusage\data\myreport
php scripts/find.php path/to/directory | php scripts/process.php /var/www/html/diskusage/data/myreport
Here are some examples of the OPTIONS:
php scripts/process.php -n "My Report" path/to/report/dir list-of-files.dat
Show the name of the report in the header. It will appear as "Disk Usage Report for: My Report".
php scripts/process.php -t 1 -td 1 diskusageinstall/data/myreport
Only show the "File Sizes", "Modified", and "File Types" totals, and the "Top 100" list, for the root directory. This can speed up report generation if there are many subdirectories. It also reduces the total size of the report.
php scripts\process.php -tz "America/New_York" c:\path\to\report\dir list-of-files.dat
Set the timezone to US Eastern Time.
php scripts/process.php -d ":" -ds "/" -fp -l 1024 -mt 819200 -n "My Report" -nt -ss 15 -su ".txt" -t 6 -td 6 -tz "America/New_York" - diskusageinstall/data/myreport - list-of-files.dat
An example of most OPTIONS in use together.
Where to Save Reports
By default, Disk Usage Reports will look for all reports within the
data directory of your installation. You can change this by editing the
reportsBaseURL variable in
Let's assume you installed Disk Usage Reports at
C:\Inetpub\wwwroot\reports for Windows).
You would want to save your reports as directories within
C:\Inetpub\wwwroot\reports\data for Windows).
Here are some examples of process.php with this in mind:
php scripts/process.php /var/www/html/reports/data/myreport list-of-files.dat
php scripts\process.php C:\Inetpub\wwwroot\reports\data\myreport list-of-files.dat
Viewing Reports
URLs for viewing reports are in the following format:
By default, the
reportpath is relative to
http://mysite.com/path/to/diskusage/data/. You can change this by editing the
reportsBaseURL variable in
Let's continue the example in Where to Save Reports where we saved our report to
C:\Inetpub\wwwroot\reports\data\myreport on Windows).
Our URL for the report would be:
Which will load the report from:
Organizing Your Reports
You can organize your reports into subdirectories as necessary.
For example, let's assume a report was created at
C:\Inetpub\wwwroot\reports\data\edu\mit on Windows). This report is for MIT and is organized into a directory called "edu".
You would view that report by browsing to
You can take this a step further by creating a historical archive of reports by organizing them by date.
For example, let's assume the report was created at
C:\inetpub\wwwroot\reports\data\edu\mit\2011-06 on Windows).
You would view that report by browsing to
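A dated archive like the one described above can be kept up to date by deriving the last directory name from the current year and month. The following is only a sketch: the function name dated_report_dir and the paths in the usage comment are hypothetical.

```shell
# Hypothetical helper: build a report directory path that ends in the
# current year and month, in the same form as 2011-06.
# Usage: dated_report_dir <data-dir> <report-path>
dated_report_dir() {
    local data_dir="$1"
    local report_path="$2"
    printf '%s/%s/%s\n' "$data_dir" "$report_path" "$(date +%Y-%m)"
}

# Example (paths are assumptions; adjust them to your installation):
# report_dir=$(dated_report_dir /var/www/html/reports/data edu/mit)
# mkdir -p "$report_dir"
# php scripts/process.php "$report_dir" list-of-files.dat
```

Running this once a month would produce one report directory per month under the same parent directory.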
Securing Reports
It is possible that a person could guess the path to a report. For example, you could guess that the report for the mathematics department is at
An easy way to avoid this issue is by including extra characters in the report directory that act as a password.
For example, by naming the report directory math_diw9481 (which would be viewed at
http://mysite.com/reports?math_diw9481) you make it very difficult to guess the Web address for a report.
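The extra characters are harder to guess when they are generated randomly. The following sketch appends 16 random hexadecimal characters to a report name. The function name obscure_report_name is hypothetical, and the sketch assumes a system that provides /dev/urandom.

```shell
# Hypothetical helper: append 16 random hexadecimal characters to a report name.
# Usage: obscure_report_name <name>
obscure_report_name() {
    local name="$1"
    local suffix

    # Read 8 random bytes, print them as hex, then strip spaces and newlines.
    suffix=$(head -c 8 /dev/urandom | od -An -tx1 | tr -d ' \n')
    printf '%s_%s\n' "$name" "$suffix"
}

# Example: choose the directory name once, then reuse it for later reports.
# obscure_report_name math
```

The generated name should be stored somewhere, since a new random suffix is produced on every call.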
Optional Report Settings
There are several optional settings that can be edited in
index.html. A description is included for each setting.
About Version Numbers
As of 1.0.0, version numbers follow the Semantic Versioning guidelines at semver.org as closely as possible.
Releases will be numbered with the following format:
The following are some of the rules that will be followed:
- Breaking backwards compatibility will increase the
- New additions that do not break backwards compatibility will increase the
- Bug fixes and minor changes will increase the