How to Fix: Linux Filesystem has Errors, MySql Database Table Crashed


Infopackets Reader Steve P. writes:

" Dear Dennis,

The company I work for is renting a dedicated Linux web server, which runs CentOS 6.5 and PHPList to manage their mailing list. Recently we were faced with a power outage on the server which then caused corruption in the PHPList database. When I look at mysqld.log I see a lot of error messages that 'mysqld: Table './phplist/phplist_usermessage' is marked as crashed and should be repaired'. I checked the database drive for errors using 'e2fsck -n /dev/sdb1' and am getting an error message that '/dev/sdb1: ********** WARNING: Filesystem still has errors **********'. I understand I need to check the drive for errors and then repair the database if this is possible but Linux isn't really my forte. Can you help? "

My response:

I asked Steve if he would like me to connect to his desktop to have a closer look at the issues using my remote desktop support service, and he agreed.

I have been running a dedicated Linux server for over 14 years and have plenty of experience managing websites and PHPList. To be fair, though, I don't have much experience with fsck or e2fsck, which are similar to 'chkdsk' in Windows: they check for and repair filesystem errors, which are often the result of an improper shutdown.

Basically, if the filesystem is corrupt, then data contained on the filesystem is likely also corrupt and will continue to corrupt if it is not corrected. Once the filesystem integrity is corrected, the damaged mysql table can be repaired - if it is repairable. If it is not repairable then the only option would be to export the table from a very recent backup and re-import.

Surprisingly, I could not find any guides online that explain how to fix this step by step, especially for a web server that uses more than one drive. As such I've decided to put together the steps I took to resolve the issue, in case anyone reading this has similar circumstances.

How to Fix: Linux Filesystem has Errors, MySql Database Table Crashed

Keep in mind that we are using CentOS here, which is similar to Red Hat Enterprise Linux. If you're using a different Linux distribution, you will need to substitute commands where necessary.

  1. The first thing you will want to do is find out when the disk(s) were last checked for consistency, to ensure the filesystem is not corrupt. In this case I'm using /dev/sda1 which holds the operating system, /dev/sdb1 which is the database drive, and /dev/sdc1 which is the backup drive.

    Note my comments always start with four hashes.

    ####grep "last checked" using tune2fs
    tune2fs -l /dev/sda1 | grep Last\ c; tune2fs -l /dev/sdb1 | grep Last\ c; tune2fs -l /dev/sdc1 | grep Last\ c

    ####output
    Last checked: Wed Feb 1 16:21:50 2017
    Last checked: Mon Dec 12 13:45:41 2016
    Last checked: Sat Feb 24 12:14:11 2017
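
With several drives, the same check can be written as a small loop. The following is only a sketch: 'last_checked' and 'report_last_checked' are hypothetical helper names I made up for illustration, and it assumes 'tune2fs -l' prints a "Last checked:" line as in the output above.

```shell
# Print only the timestamp from the "Last checked:" line of
# `tune2fs -l` output, which is read from standard input.
last_checked() {
    sed -n 's/^Last checked:[[:space:]]*//p'
}

# Report the last check time for every device given as an argument.
# This calls tune2fs, so it has to be run as root.
report_last_checked() {
    for dev in "$@"; do
        printf '%s: %s\n' "$dev" "$(tune2fs -l "$dev" | last_checked)"
    done
}

# Usage (as root):
#   report_last_checked /dev/sda1 /dev/sdb1 /dev/sdc1
```

Splitting the parsing from the tune2fs call keeps the parsing step easy to try out on saved output without touching a real drive.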
     
  2. Judging by the above output, the drives haven't been checked recently for corruption. As such we will want to force fsck on reboot. Typically this is done using the following command(s):

    ####force fsck on operating system drive (similar commands)
    touch /forcefsck
    shutdown -rF now

    This type of consistency check works fine if you only use one drive / one partition. However, in this case we are using 3 different drives. Upon reboot I ran the command again and noted that the other two drives were still not checked:

    ####grep "last checked" using tune2fs
    tune2fs -l /dev/sda1 | grep Last\ c; tune2fs -l /dev/sdb1 | grep Last\ c; tune2fs -l /dev/sdc1 | grep Last\ c

    ####output
    Last checked: Sat May 6 19:29:50 2017  <=== checked for consistency
    Last checked: Mon Dec 12 13:45:41 2016  <=== still not checked
    Last checked: Sat Feb 24 12:14:11 2017  <=== still not checked
     
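  3. At this point I decided to check the drives in read-only mode. A read-only check (the -n option) makes no changes, so it is safe to run even if the drive is mounted and in use, although the results for a mounted drive may not be reliable. You will never, ever want to run fsck or e2fsck in repair mode on a drive that is mounted or you will severely corrupt the data. (For ext2/ext3/ext4 filesystems, fsck is a front end that calls e2fsck, so the two behave the same way.)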

    Here is the command to check for file system errors in read only mode:

    ####check only sdb1, sdc1 in read only mode
    e2fsck -n /dev/sdb1
    e2fsck -n /dev/sdc1

    Upon running the first command I saw the output:

    e2fsck 1.41.12 (17-May-2010)
    Warning! /dev/sdb1 is mounted.
    Warning: skipping journal recovery because doing a read-only filesystem check.
    /dev/sdb1 contains a file system with errors, check forced.
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    Free blocks count wrong (963597, counted=963596).
    Fix? no
    Free inodes count wrong (507502, counted=507501).
    Fix? no
    /dev/sdb1: ********** WARNING: Filesystem still has errors **********
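
Since a repairing check must never touch a mounted drive, a script can test for that first. This is a minimal sketch, assuming the device is listed in /proc/mounts under the same name you pass in (with LVM, UUID-based or symlinked device names it may appear differently). 'is_mounted' and 'check_if_unmounted' are hypothetical helper names.

```shell
# Succeed (exit status 0) if the given device appears in the mount table.
# The optional second argument is the mount table to read and
# defaults to /proc/mounts.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Refuse to run a repairing check on a mounted device.
# e2fsck -f forces a full check even if the filesystem is marked clean.
check_if_unmounted() {
    if is_mounted "$1"; then
        echo "$1 is mounted; unmount it before running a repairing e2fsck" >&2
        return 1
    fi
    e2fsck -f "$1"
}

# Usage (as root):
#   check_if_unmounted /dev/sdb1
```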
     
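  4. Once again it would be ideal to run fsck during a reboot, but I was not sure how to do this on a non-root filesystem. After a bit of research I came up with two solutions. One is to modify the fstab (filesystem table) so that the filesystem is checked at boot. You can do this by modifying the last column of the mount point entry. According to the fstab man page this sixth field is 'fs_passno', which sets the order in which fsck checks filesystems at boot: 0 means never check, 1 is meant for the root filesystem, and 2 is recommended for all other filesystems. In this case the value was 0, so the drives were being skipped at boot, and I changed it to 1. Note that this field only makes the drive eligible for checking; how often a check is actually forced depends on the "maximum mount count" stored in the filesystem itself.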

    Here is an example:

    nano -w /etc/fstab

    ####was:
    /dev/sdb1 /mnt/db ext4 auto,rw 0 0
    /dev/sdc1 /backups ext4 auto,rw 0 0

    ####now:
    /dev/sdb1 /mnt/db ext4 auto,rw 0 1
    /dev/sdc1 /backups ext4 auto,rw 0 1

    Once that is done you will want to reboot.
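
Before rebooting, it can help to confirm what the sixth field is set to for every entry. Here is a small sketch that lists each device with that field (called fs_passno in the fstab man page). 'fstab_passno' is a hypothetical helper name, and treating a missing sixth field as 0 is my reading of the man page's default.

```shell
# List each filesystem in an fstab-format file together with its
# sixth field (fs_passno). Comment lines and blank lines are skipped,
# and a missing sixth field is reported as 0.
# The file to read defaults to /etc/fstab.
fstab_passno() {
    awk '!/^[ \t]*(#|$)/ { print $1, ($6 == "" ? 0 : $6) }' "${1:-/etc/fstab}"
}

# Usage:
#   fstab_passno
#   fstab_passno /path/to/a/copy/of/fstab
```

Any entry that prints 0 is skipped by the boot-time fsck.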

    Another way to do this is to use the 'tune2fs' command, which sets the maximum mount count stored in the filesystem:

    ####we can set fsck to check filesystem after every 5 mounts:
    tune2fs -c 5 /dev/sdb1
    tune2fs -c 5 /dev/sdc1
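
To see how close a drive is to its next forced check, the two counters can be read back from 'tune2fs -l'. This is a rough sketch, assuming the output contains the "Mount count:" and "Maximum mount count:" lines; 'mounts_until_check' is a hypothetical helper name.

```shell
# Read `tune2fs -l` output on standard input and print how many mounts
# remain before the mount-count check is due. A maximum mount count of
# 0 or -1 means the mount-count check is switched off.
mounts_until_check() {
    awk -F': *' '
        $1 == "Mount count"         { count = $2 }
        $1 == "Maximum mount count" { max = $2 }
        END {
            if (max <= 0) print "mount-count check disabled"
            else          print max - count
        }'
}

# Usage (as root):
#   tune2fs -l /dev/sdb1 | mounts_until_check
```

A result of zero or less means the check is due the next time fsck looks at the drive.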
     
  5. Upon rebooting I ran the e2fsck command in read-only mode to check for consistency:

    e2fsck -n /dev/sdb1

    However I was still receiving:

    /dev/sdb1: ********** WARNING: Filesystem still has errors **********

    In this case, I decided to see if there were any meaningful logs I could read regarding fsck. After a bit of research I found the following commands to reveal information about fsck:

    ####fsck doesn't have logs - check /var/log/*
    grep -A 1 fsck /var/log/*
    cat /var/log/boot.log

    Unfortunately, nothing in these log files told me anything meaningful.
     
  6. At this point I did a bit more digging and discovered that the "WARNING: Filesystem still has errors" message may be erroneous because the filesystem being scanned is still in use. To test this theory I decided to unmount /dev/sdb1 and scan it while the operating system was running.

    ####device is busy when trying to unmount
    umount /mnt/db
    umount: /mnt/db: device is busy.
    (In some cases useful info about processes that use the device is found by lsof(8) or fuser(1))

    As you can see I was not able to unmount /dev/sdb1 (which is mounted at /mnt/db). Notice that the error message also suggests looking at "lsof" or "fuser" for more help. A bit of research on these provided me with the information I needed. The 'fuser' program shows which processes are using a particular mount point. How incredibly useful! To put it plainly: I was not able to unmount /mnt/db because a running program still had files open on that mount point.

    ####see which process is using the mount point
    fuser -vm /mnt/db

    ####output:
    USER PID ACCESS COMMAND
    /mnt/db: root 3617 f.... fail2ban-server

    As you can see here the fail2ban-server was using the /mnt/db mount point. Stopping the fail2ban server should allow me to unmount the drive successfully:

    ####stop the fail2ban service
    service fail2ban stop

    Stopping fail2ban: [ OK ]
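
If more than one process is holding the mount point, the fuser table can be boiled down to a list of PIDs and program names. This is just a sketch that assumes the table looks like the output above, with PID, ACCESS and COMMAND as the last three columns; 'fuser_processes' is a hypothetical helper name. I am also assuming fuser writes this table to standard error, hence the '2>&1' in the usage line, which is worth verifying on your system.

```shell
# Read captured `fuser -vm` output on standard input and print
# "PID COMMAND" for every row whose PID column is numeric. This skips
# the header row and rows that describe the kernel mount itself.
fuser_processes() {
    awk 'NF >= 3 && $(NF-2) ~ /^[0-9]+$/ { print $(NF-2), $NF }'
}

# Usage (as root):
#   fuser -vm /mnt/db 2>&1 | fuser_processes
```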

    ####now I can successfully unmount /mnt/db!
    umount /mnt/db

    ####and, now I can run fsck on the drive without worrying about corrupting anything
    fsck /dev/sdb1

    fsck from util-linux-ng 2.17.2
    e2fsck 1.41.12 (17-May-2010)
    /dev/sdb1 has been mounted 1 times without being checked, check forced.
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    /dev/sdb1: 16786/524288 files (1.6% non-contiguous), 1133299/2096896 blocks

    As you can see from the above output, there are no errors.
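
Rather than reading the output by eye, a script can look at the exit status that e2fsck returns. According to the e2fsck man page the status is a sum of flag values (1, 2, 4, 8, 16, 32, 128), and the sketch below translates it into plain text. 'describe_fsck_status' is a hypothetical helper name.

```shell
# Translate an e2fsck exit status into readable text.
# The status is a bit mask, so several conditions can apply at once.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    if [ $(( status & 1 )) -ne 0 ]; then echo "errors corrected"; fi
    if [ $(( status & 2 )) -ne 0 ]; then echo "errors corrected, system should be rebooted"; fi
    if [ $(( status & 4 )) -ne 0 ]; then echo "errors left uncorrected"; fi
    if [ $(( status & 8 )) -ne 0 ]; then echo "operational error"; fi
    if [ $(( status & 16 )) -ne 0 ]; then echo "usage or syntax error"; fi
    if [ $(( status & 32 )) -ne 0 ]; then echo "cancelled by user request"; fi
    if [ $(( status & 128 )) -ne 0 ]; then echo "shared library error"; fi
    return 0
}

# Usage (as root):
#   e2fsck -n /dev/sdb1; describe_fsck_status $?
```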

    I also decided to run e2fsck in read only mode (as I did before) to see if I would receive "WARNING: Filesystem still has errors" again:

    e2fsck -n /dev/sdb1

    e2fsck 1.41.12 (17-May-2010)
    /dev/sdb1: clean, 16786/524288 files, 1133299/2096896 blocks (check after next mount)

    As you can see here, we have a clean bill of health. Nothing else needed to be done, but to test my theory that the original error message was erroneous I decided to remount the drive and run e2fsck in read-only mode again:

    ####remount all mount points according to fstab
    mount -a

    ####check again using e2fsck in read only mode
    ####this will give an error because the device is in use, but I know that the filesystem is fine
    e2fsck -n /dev/sdb1
    e2fsck 1.41.12 (17-May-2010)
    Warning! /dev/sdb1 is mounted.
    Warning: skipping journal recovery because doing a read-only filesystem check.
    /dev/sdb1 contains a file system with errors, check forced.
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    Free blocks count wrong (963597, counted=963596).
    Fix? no
    Free inodes count wrong (507502, counted=507501).
    Fix? no
    /dev/sdb1: ********** WARNING: Filesystem still has errors **********
    /dev/sdb1: 16786/524288 files (1.6% non-contiguous), 1133299/2096896 blocks

    FILE SYSTEM IS OK! This error message is erroneous because the file system is mounted and in use during the read-only check.
     
  7. After all of that, it's now time to check the MySQL databases and repair where necessary. You might want to stop any services that use MySQL, such as the Apache web server, especially if you have a busy server:

    ####stop apache temporarily while running mysql checks
    service httpd stop

    The command below sets the ROOT_DB_PW shell variable so that I don't have to keep typing the password into the command line.

    ####set the root database password for mysql so we can fire off the commands below
    ROOT_DB_PW=enter_your_root_db_password_here

    ####check the databases for corruption
    mysqlcheck -c company -u root -p$ROOT_DB_PW
    mysqlcheck -c mail -u root -p$ROOT_DB_PW
    mysqlcheck -c phplist -u root -p$ROOT_DB_PW
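
One thing to be aware of: a password passed with -p on the command line can show up in the process list and in your shell history. The MySQL client programs can read credentials from an option file instead. What follows is only a sketch of that alternative, not what I ran on Steve's server. 'write_mysql_optfile' is a hypothetical helper name, the file name is just an example, and passwords containing double quotes or backslashes would need extra escaping.

```shell
# Write a private MySQL client option file holding a user and password.
# Arguments: path of the file to create, user name, password.
write_mysql_optfile() {
    optfile=$1
    (
        # Create the file readable and writable by its owner only.
        umask 077
        printf '[client]\nuser=%s\npassword="%s"\n' "$2" "$3" > "$optfile"
    )
    # Tighten permissions in case the file already existed.
    chmod 600 "$optfile"
}

# Usage (--defaults-extra-file has to be the first option):
#   write_mysql_optfile "$HOME/.my-root.cnf" root "$ROOT_DB_PW"
#   mysqlcheck --defaults-extra-file="$HOME/.my-root.cnf" -c phplist
```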

    The last command I ran (which checked the phplist database) reported errors. At this point I decided to check the mysql log for errors:

    ####look at the mysql log for errors:
    tail -n 25 /var/log/mysqld.log

    170508 14:31:22 [ERROR] /usr/libexec/mysqld: Table './phplist/phplist_usermessage' is marked as crashed and should be repaired
    170508 14:31:22 [ERROR] /usr/libexec/mysqld: Table './phplist/phplist_usermessage' is marked as crashed and should be repaired
    170508 14:31:22 [ERROR] /usr/libexec/mysqld: Table './phplist/phplist_usermessage' is marked as crashed and should be repaired
    170508 14:31:22 [ERROR] /usr/libexec/mysqld: Table './phplist/phplist_usermessage' is marked as crashed and should be repaired
    170508 14:31:22 [ERROR] /usr/libexec/mysqld: Table './phplist/phplist_usermessage' is marked as crashed and should be repaired
    170508 14:31:22 [ERROR] /usr/libexec/mysqld: Table './phplist/phplist_usermessage' is marked as crashed and should be repaired
    170508 14:31:22 [ERROR] /usr/libexec/mysqld: Table './phplist/phplist_usermessage' is marked as crashed and should be repaired
    170508 14:31:22 [ERROR] /usr/libexec/mysqld: Table './phplist/phplist_usermessage' is marked as crashed and should be repaired
    170508 14:31:22 [ERROR] /usr/libexec/mysqld: Table './phplist/phplist_usermessage' is marked as crashed and should be repaired
    170508 14:31:22 [ERROR] /usr/libexec/mysqld: Table './phplist/phplist_usermessage' is marked as crashed and should be repaired
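
When the log repeats the same message over and over like this, it is easier to reduce it to a list of affected tables. The sketch below assumes the log lines have the same "Table './database/table' is marked as crashed" wording shown above; 'crashed_tables' is a hypothetical helper name.

```shell
# Read a MySQL error log on standard input and print each table that
# is reported as crashed, once, in database.table form.
crashed_tables() {
    sed -n "s|.*Table '\./\([^/]*\)/\([^']*\)' is marked as crashed.*|\1.\2|p" | sort -u
}

# Usage:
#   tail -n 200 /var/log/mysqld.log | crashed_tables
```

For the log above this boils the ten identical lines down to a single entry, phplist.phplist_usermessage.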

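    Now that I know for sure we have errors in the database, it's time to repair the table. This can be done using the 'repair table tablename;' statement, run through the mysql client on the command line. In this case the corrupt table is 'phplist_usermessage' in the database 'phplist'. Keep in mind that REPAIR TABLE does not work on InnoDB tables; the 'marked as crashed' error typically comes from MyISAM tables, which it does support.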

    ####repair phplist_usermessage from database phplist
    mysql -u root -p$ROOT_DB_PW -e "repair table phplist.phplist_usermessage;"

    Once that was completed, I ran mysqlcheck and it reported everything was "OK":

    mysqlcheck -c phplist -u root -p$ROOT_DB_PW

At this point I was satisfied that the drive had no errors and the database was corrected, so I rebooted the server and all was well again.

I hope this helps anyone facing the same issue.

Additional 1-on-1 Support: From Dennis

As you can see from the instructions here, I had to do a bit of research along the way because this is not something I have to do on a regular basis. At any rate we managed to get everything resolved and Steve was very happy with my remote desktop service. That said, if anyone reading this is having a similar problem and needs help with this or anything to do with Linux (or Windows for that matter), I can help. Simply contact me, briefly describing the issue, and I'll get back to you as soon as possible.

Got a Computer Question or Problem? Ask Dennis!

I need more computer questions. If you have a computer question - or even a computer problem that needs fixing - please email me with your question so that I can write more articles like this one. I can't promise I'll respond to all the messages I receive (depending on the volume), but I'll do my best.

About the author: Dennis Faas is the owner and operator of Infopackets.com. With over 30 years of computing experience, Dennis' areas of expertise cover a broad range and include PC hardware, Microsoft Windows, Linux, network administration, and virtualization. Dennis holds a Bachelor's degree in Computer Science (1999) and has authored 6 books on the topics of MS Windows and PC Security. If you like the advice you received on this page, please up-vote / Like this page and share it with friends. For technical support inquiries, Dennis can be reached via live chat on this site using the Zopim Chat service (currently located at the bottom left of the screen); optionally, you can contact Dennis through the website contact form.
