Tuesday, November 9, 2010

Handling overflows in the /var and / (root) file systems in AIX

http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.baseadmn/doc/baseadmndita/fsvarover.htm
Resolving overflows in the /var file system
Check the following when the /var file system has become full.

You can use the find command to look for large files in the /var directory. For example:
find /var -xdev -size +2048 -ls | sort -r -n +6
For detailed information, see the command description for the find command.
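
The search above can be wrapped in a small helper. This is only a sketch, not part of the IBM procedure: the function name list_large_files is made up for illustration, and it swaps -ls for du -k so that the first column is a size in kilobytes that sorts numerically.

```shell
# list_large_files DIR [BLOCKS]
# Prints "size_in_KB path" for regular files under DIR that are larger
# than BLOCKS 512-byte blocks (default 2048, which is 1 MB), largest first.
# -xdev keeps find on the file system that contains DIR.
list_large_files() {
    find "$1" -xdev -type f -size +"${2:-2048}" -exec du -k {} \; | sort -rn
}
```

For example, list_large_files /var lists everything over 1 MB, and list_large_files /var 20480 raises the cutoff to 10 MB.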

Check for obsolete or leftover files in /var/tmp.
Check the size of the /var/adm/wtmp file, which logs all logins, rlogins and telnet sessions. The log will grow indefinitely unless system accounting is running. System accounting clears it out nightly. The /var/adm/wtmp file can be cleared out or edited to remove old and unwanted information. To clear it, use the following command:
cp /dev/null /var/adm/wtmp
To edit the /var/adm/wtmp file, first copy the file temporarily with the following command:
/usr/sbin/acct/fwtmp < /var/adm/wtmp >/tmp/out
Edit the /tmp/out file to remove unwanted entries then replace the original file with the following command:
/usr/sbin/acct/fwtmp -ic < /tmp/out > /var/adm/wtmp
Clear the error log in the /var/adm/ras directory using the following procedure. The error log is never cleared unless it is manually cleared.
Note: Never use the cp /dev/null command to clear the error log. A zero-length errlog file disables the error logging functions of the operating system and must be replaced from a backup.
Stop the error daemon using the following command:
/usr/lib/errstop
Remove the error log file, or move it to a different file system, using one of the following commands:
rm /var/adm/ras/errlog
or
mv /var/adm/ras/errlog filename
Where filename is the name of the moved errlog file.

Note: The historical error data is deleted if you remove the error log file.
Restart the error daemon using the following command:
/usr/lib/errdemon
Note: Consider limiting the errlog by running the following entries in cron:
0 11 * * * /usr/bin/errclear -d S,O 30
0 12 * * * /usr/bin/errclear -d H 90
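
The stop, move, restart sequence for the error log can be scripted so the daemon is not left stopped by accident. The following is a sketch under assumptions: the names run and move_errlog and the DRY_RUN variable are invented here, and the destination path is whatever you pass in. It defaults to printing the commands instead of running them.

```shell
# DRY_RUN defaults to 1, so nothing is executed until you opt in
# with DRY_RUN=0.
: "${DRY_RUN:=1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# move_errlog DEST: stop the error daemon, move the log to DEST
# (ideally on another file system), then restart the daemon.
# Each step runs only if the previous one succeeded.
move_errlog() {
    run /usr/lib/errstop &&
    run mv /var/adm/ras/errlog "$1" &&
    run /usr/lib/errdemon
}
```

Run it once in the default dry-run mode to review the three commands, then repeat with DRY_RUN=0 as root.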
Check whether the trcfile file in this directory is large. If it is large and a trace is not currently being run, you can remove the file using the following command:
rm /var/adm/ras/trcfile
If your dump device is set to hd6 (which is the default), there might be a number of vmcore* files in the /var/adm/ras directory. If their file dates are old or you do not want to retain them, you can remove them with the rm command.
Check the /var/spool directory, which contains the queueing subsystem files. Clear the queueing subsystem using the following commands:
stopsrc -s qdaemon
rm /var/spool/lpd/qdir/*
rm /var/spool/lpd/stat/*
rm /var/spool/qdaemon/*
startsrc -s qdaemon
Check the /var/adm/acct directory, which contains accounting records. If accounting is running, this directory may contain several large files.
Check the /var/preserve directory for terminated vi sessions. Generally, it is safe to remove these files. If a user wants to recover a session, you can use the vi -r command to list all recoverable sessions. To recover a specific session, use vi -r filename.
Modify the /var/adm/sulog file, which records the number of attempted uses of the su command and whether each was successful. This is a flat file and can be viewed and modified with a favorite editor. If it is removed, it will be recreated by the next attempted su command.
Modify the /var/tmp/snmpd.log file, which records events from the snmpd daemon. If the file is removed, it will be recreated by the snmpd daemon.
Note: The size of the /var/tmp/snmpd.log file can be limited so that it does not grow indefinitely. Edit the /etc/snmpd.conf file to change the number (in bytes) in the appropriate section for size.


Use a find command to select files older than a given age (for example, 8 days) and delete them.
This command can be put into the crontab file and executed daily.

00 04 * * * find /var/adm/cron/log -ctime +8 -exec rm -f {} \;
(will delete all files older than 8 days, every day at 4 a.m.)

The safe way, per Brooks, to clear out the mailbox:
Type mail, then at the mail prompt enter:
d *
This deletes all messages. Quit with q so the deletions take effect.

On ors0 and ors3:
cat /dev/null > /var/adm/cron/log
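
Emptying the cron log by hand works, but a guarded version that only truncates once the file passes a size limit is easier to schedule. A minimal sketch, assuming a hypothetical helper name (truncate_if_larger) and a limit you choose yourself:

```shell
# truncate_if_larger FILE MAX_KB
# Empties FILE in place (same effect as "cat /dev/null > FILE") only when
# it is larger than MAX_KB kilobytes. The file keeps its owner and
# permissions because it is truncated, not removed.
# Returns 0 if the file was truncated, 1 otherwise.
truncate_if_larger() {
    [ -f "$1" ] || return 1
    # wc output is left unquoted so any padding spaces are dropped.
    if [ $(wc -c < "$1") -gt $(($2 * 1024)) ]; then
        : > "$1"
        return 0
    fi
    return 1
}
```

For example, truncate_if_larger /var/adm/cron/log 1024 empties the log only once it exceeds 1 MB.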


http://groups.google.com/group/comp.unix.aix/browse_thread/thread/734160493ba4c9e2?pli=1

How to clean the root overflow?
http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.baseadmn/doc/baseadmndita/fsvarover.htm
Check the following when the root file system (/) has become full.

# df -m
# lsvg -p rootvg
# lsvg -l rootvg
# lsvg rootvg
I would recommend increasing the size of the / (root) file system. The /tmp file system is only 4% used, so there is no need to worry about it yet.
# chfs -a size=#M /

For example, to increase the root file system to 256 MB:
# chfs -a size=256M /
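
Before resizing, it helps to see which file systems are actually close to full. The filter below is a sketch that reads df -m output on standard input. It assumes the AIX column layout (%Used in column 4, mount point in column 7), and the function name full_filesystems is hypothetical.

```shell
# full_filesystems THRESHOLD
# Reads "df -m" output on stdin and prints "mount_point percent_used"
# for every file system at or above THRESHOLD percent used.
# Assumption: AIX layout, where %Used is field 4 and the mount point is
# field 7. Adjust the field numbers on other platforms.
full_filesystems() {
    # NR > 1 skips the header line; "$4 + 0" turns "98%" into 98.
    awk -v limit="$1" 'NR > 1 && $4 + 0 >= limit { print $7, $4 }'
}
```

Typical use would be df -m | full_filesystems 90, which prints only the file systems that are at least 90% used.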

Use the following command to read the contents of the /etc/security/failedlogin file:
who /etc/security/failedlogin
The condition of TTYs respawning too rapidly can create failed login entries. To clear the file after reading or saving the output, execute the following command:
cp /dev/null /etc/security/failedlogin
Check the /dev directory for a device name that is typed incorrectly. If a device name is typed incorrectly, such as rmto instead of rmt0, a file will be created in /dev called rmto. The command will normally proceed until the entire root file system is filled before failing. /dev is part of the root (/) file system. Look for entries that are not devices (that do not have a major or minor number). To check for this situation, use the following command:
cd /dev
ls -l | pg
In the same location that would indicate a file size for an ordinary file, a device file has two numbers separated by a comma. For example:
crw-rw-rw- 1 root system 12,0 Oct 25 10:19 rmt0
If the file name or size location indicates an invalid device, as shown in the following example, remove the associated file:
-rw-r--r-- 1 root system 9375473 Oct 25 10:19 rmto
Note: Do not remove valid device names in the /dev directory. One indicator of an invalid device is an associated file size that is larger than 500 bytes.
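
Rather than paging through the whole ls -l listing, find can pick out the suspicious entries directly. This is a sketch: the function name suspect_dev_files is invented, and the 500-byte cutoff is taken from the note above. It only lists files and removes nothing.

```shell
# suspect_dev_files [DIR]
# Lists regular files larger than 500 bytes under DIR (default /dev).
# Real device nodes are character or block special files, so -type f
# skips them; whatever is listed is a candidate for a mistyped device
# name and should be reviewed by hand before removal.
suspect_dev_files() {
    find "${1:-/dev}" -xdev -type f -size +500c -exec ls -l {} \;
}
```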
If system auditing is running, the default /audit directory can rapidly fill up and require attention.
Use the find command to check for very large files that might be removed. For example, to find all files in the root (/) file system larger than 1 MB, use the following command:
find / -xdev -size +2048 -ls |sort -r -n +6
This command finds all files greater than 1 MB and sorts them in reverse order with the largest files first. Other flags for the find command, such as -newer, might be useful in this search. For detailed information, see the command description for the find command.
Note: When checking the root directory, major and minor numbers for devices in the /dev directory will be interspersed with real files and file sizes. Major and minor numbers, which are separated by a comma, can be ignored.
Before removing any files, use the following command to ensure a file is not currently in use by a user process:
fuser filename
Where filename is the name of the suspect large file. If a file is open at the time of removal, it is only removed from the directory listing. The blocks allocated to that file are not freed until the process holding the file open is killed.
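
The fuser check and the removal can be combined so a file is deleted only when nothing holds it open. This is a sketch under assumptions: remove_if_unused and the FUSER variable are names made up here, and a process could still open the file between the check and the rm.

```shell
# FUSER names the command used to list processes holding a file open.
# It defaults to fuser; it is a variable only so the check can be
# swapped out, for example when testing.
: "${FUSER:=fuser}"

# remove_if_unused FILE
# Removes FILE only when $FUSER reports no process IDs for it.
# fuser writes process IDs to standard output and file names to
# standard error, so only standard output is inspected.
remove_if_unused() {
    if [ -n "$($FUSER "$1" 2>/dev/null)" ]; then
        echo "$1 is in use; not removed" >&2
        return 1
    fi
    rm -f "$1"
}
```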

Disk overflows
A disk overflow occurs when too many files fill up the allotted space. This can be caused by a runaway process that creates many unnecessary files.

You can use the following procedures to correct the problem:

Note: You must have root user authority to remove processes other than your own.
Identifying problem processes
Use this procedure to isolate problem processes.
Terminating a process
You can terminate problem processes.
Reclamation of file space without terminating a process
To reclaim the blocks allocated to an active file without terminating the process, redirect the output of another command to the file. The data redirection truncates the file and frees its disk blocks.
/ (root) overflow
Check the following when the root file system (/) has become full.
Resolving overflows in the /var file system
Check the following when the /var file system has become full.
Fix other file systems and general search techniques
Use the find command with the -size flag to locate large files or, if the file system recently overflowed, use the -newer flag to find recently modified files.
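
The -newer search compares against the modification time of a reference file, so the usual trick is to create one with touch -t. A minimal sketch (the function name newer_files and the reference path in the usage note are hypothetical):

```shell
# newer_files DIR REFERENCE
# Lists regular files under DIR that were modified more recently than
# the file REFERENCE, staying on DIR's own file system.
newer_files() {
    find "$1" -xdev -type f -newer "$2" -print
}
```

For example, touch -t 201011090000 /tmp/ref followed by newer_files /var /tmp/ref shows what has changed in /var since midnight on November 9, 2010.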

Command for cleaning up file systems automatically

Use the skulker command to clean up file systems by removing unwanted files.

Type the following from the command line:
skulker -p
The skulker command is used to periodically purge obsolete or unneeded files from file systems. Candidates include files in the /tmp directory, files older than a specified age, a.out files, core files, or ed.hup files. For more information, see the command description for the skulker command.
The skulker command is typically run daily, as part of an accounting procedure run by the cron command during off-peak hours.