PHP session_start() failed: No space left on device (Plesk plesk-php-cleanuper)

I kept receiving intermittent PHP warnings along the lines of E_WARNING: session_start(): open(/var/lib/php/session/sess_ji9k4chqke3pde98a5n5m1vca5, O_RDWR) failed: No space left on device (28). Usually this indicates that the disk where the sessions are stored is full, so the best place to start is to check whether that disk is in fact full.

# Check disk space usage
df -h

# Should display something like this
Filesystem            Size  Used Avail Use% Mounted on
/dev/md1              4.0G  883M  3.2G  22% /
/dev/mapper/vg00-usr  4.0G  1.8G  2.0G  49% /usr
/dev/mapper/vg00-var  890G  231G  614G  28% /var
none                  5.9G  116K  5.9G   1% /tmp

In my case the disk where the sessions were being stored had plenty of space - 600GB free!
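One other thing worth ruling out at this point is inode exhaustion, since a filesystem can report "No space left on device" when it runs out of inodes even though df -h shows free space. The following is a minimal sketch, assuming a df that accepts the -P and -i flags; the function name inode_usage is made up for illustration and the default path is simply the one from the warning.

```shell
# Hypothetical helper: print inode usage (the IUse% column) for the
# filesystem that holds a given path.
# Defaults to the session path from the warning; pass another path to override.
inode_usage() {
    path="${1:-/var/lib/php/session}"
    # -P keeps each filesystem on a single line; -i reports inodes instead of blocks
    df -Pi "$path" | awk 'NR == 2 { print $5 }'
}
```

Running inode_usage /var/lib/php/session prints a single percentage. A value close to 100% would point at inodes rather than disk space as the culprit.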

It turns out the issue was that Plesk's hourly PHP cleanup script, /etc/cron.hourly/plesk-php-cleanuper, was not able to finish because the sessions directory was too full. My guess is that somewhere in the update from Plesk 10.4 to 11 the script got turned off or removed, and a later minor update turned it back on. During that time the sessions folder grew to around 800MB of tiny session files. If the average session file is less than 1 KB, as I suspect, that works out to somewhere around a million small session files. Once it was turned back on, plesk-php-cleanuper could never finish clearing the old session files.
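To see how many session files have actually piled up, a rough count helps. This is a minimal sketch, assuming the session path from the warning above and a find that supports -maxdepth; the function name count_sessions is made up for illustration. It deliberately avoids ls, which sorts its output and can be very slow on a directory this large.

```shell
# Hypothetical helper: count PHP session files in a directory without sorting them.
# Defaults to the session path from the warning; pass another path to override.
count_sessions() {
    dir="${1:-/var/lib/php/session}"
    # Only look at files named sess_* directly inside the directory
    find "$dir" -maxdepth 1 -type f -name 'sess_*' | wc -l
}
```

Calling count_sessions /var/lib/php/session prints the number of session files, and du -sh /var/lib/php/session shows how much space they take in total. On a directory with this many entries, both can take a while to return.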

tl;dr - now to the fix

# 1. Gracefully stop Apache so no incoming requests arrive during these changes.
apachectl -k graceful-stop

# 2. Rename the current PHP session directory
mv /var/lib/php/session /var/lib/php/session.old

# 3. Recreate PHP session directory and set permissions
mkdir /var/lib/php/session
chmod 1777 /var/lib/php/session

# 4. Start Apache
apachectl -k start
 
# 5. Delete old session files (optional)*
mkdir /var/lib/php/empty
rsync -a --delete /var/lib/php/empty/ /var/lib/php/session.old/

Step 5 is optional. On the server I was working on, I let the rsync command run for around 24 hours and it still did not finish. The rsync will eat up your disk I/O, and this was a production server, so I could not have it run slow for that long. Disk space was also not really an issue. But if you need that disk space and can allow your server to run slow for a day or two, the rsync method is one of the fastest ways to delete millions of files.
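If the rsync in step 5 is too hard on disk I/O, one alternative is to delete the old files at low priority and accept that it will take longer. This is a rough sketch rather than something I ran on the server described here; it assumes a find that supports -delete, and the function name purge_old_sessions is made up for illustration. It uses ionice's idle class when that works and falls back to plain nice otherwise. Note that the idle I/O class only has an effect with I/O schedulers that honor it.

```shell
# Hypothetical helper: delete every regular file under a directory at low priority.
# Usage: purge_old_sessions /var/lib/php/session.old
purge_old_sessions() {
    dir="$1"
    # Refuse to run unless an existing directory was given
    [ -n "$dir" ] && [ -d "$dir" ] || return 1
    # Probe whether ionice's idle class (-c3) is usable on this system
    if ionice -c3 true >/dev/null 2>&1; then
        # Idle I/O class plus lowest CPU priority
        ionice -c3 nice -n 19 find "$dir" -type f -delete
    else
        # Fall back to lowering CPU priority only
        nice -n 19 find "$dir" -type f -delete
    fi
}
```

The function removes the files but leaves the directory itself in place, so a final rmdir /var/lib/php/session.old is still needed once it is empty.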

