28 Replies. Latest reply on Mar 16, 2020 4:51 AM by Brice-Emmanuel Loiseaux

    Data is full. Tried performing compaction but ended with an error

    Hitesh Jha
      Hello All,

      When I ran the command below, I got the following error.

       

      [tideway@cltdvladmc01 ~]$ tw_ds_offline_compact --smallest-first

       

      db_recover: BDB3015 p0002_nClusterService_hist: write failed for page 9559

      db_recover: BDB3027 p0002_nClusterService_hist: unable to flush page: 9559

      db_recover: BDB0060 PANIC: fatal region error detected; run recovery

      db_recover: BDB3015 p0002_nClusterService_hist: write failed for page 9560

      db_recover: BDB3027 p0002_nClusterService_hist: unable to flush page: 9560

      db_recover: BDB0060 PANIC: fatal region error detected; run recovery

      db_recover: BDB3015 p0002_nClusterService_hist: write failed for page 9561

      db_recover: BDB3027 p0002_nClusterService_hist: unable to flush page: 9561

      db_recover: BDB0060 PANIC: fatal region error detected; run recovery

      db_recover: BDB3015 p0002_nClusterService_hist: write failed for page 9563

      db_recover: BDB3027 p0002_nClusterService_hist: unable to flush page: 9563

      db_recover: BDB0060 PANIC: fatal region error detected; run recovery

      db_recover: BDB3015 p0002_nClusterService_hist: write failed for page 9564

      db_recover: BDB3027 p0002_nClusterService_hist: unable to flush page: 9564

      db_recover: BDB0060 PANIC: fatal region error detected; run recovery

      db_recover: BDB3015 p0002_nClusterService_hist: write failed for page 9565

      db_recover: BDB3027 p0002_nClusterService_hist: unable to flush page: 9565

      db_recover: BDB0060 PANIC: fatal region error detected; run recovery

      db_recover: BDB3015 p0002_nClusterService_hist: write failed for page 9566

      db_recover: BDB3027 p0002_nClusterService_hist: unable to flush page: 9566

      db_recover: BDB1544 process-private: unable to find environment

      db_recover: DB_ENV->open: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery

       

      The output of tw_ds_offline_compact --fix-interrupted is below.

       

      [tideway@cltdvladmc01 ~]$ screen

      db_load: BDB0137 write: 0x267bbf890, 4096: No space left on device

      db_load: BDB3015 p0004_nDiscoveredCommandResult_pidx: write failed for page 22752213

      db_load: BDB3027 p0004_nDiscoveredCommandResult_pidx: unable to flush page: 22752213

      db_load: BDB0137 write: 0x38ccd05e0, 4096: No space left on device

      db_load: BDB3015 p0004_nDiscoveredCommandResult_pidx: write failed for page 22752214

      db_load: BDB3027 p0004_nDiscoveredCommandResult_pidx: unable to flush page: 22752214

      db_load: BDB0137 write: 0xa69dd290, 4096: No space left on device

      db_load: BDB3015 p0004_nDiscoveredCommandResult_pidx: write failed for page 22752215

      db_load: BDB3027 p0004_nDiscoveredCommandResult_pidx: unable to flush page: 22752215

      db_load: BDB0137 write: 0x1f2315450, 4096: No space left on device

      db_load: BDB3015 p0004_nDiscoveredCommandResult_pidx: write failed for page 22752216

      db_load: BDB3027 p0004_nDiscoveredCommandResult_pidx: unable to flush page: 22752216

      db_load: BDB0137 write: 0x31300d1b0, 4096: No space left on device

      db_load: BDB3015 p0004_nDiscoveredCommandResult_pidx: write failed for page 22752217

      db_load: BDB3027 p0004_nDiscoveredCommandResult_pidx: unable to flush page: 22752217

      db_load: BDB0137 write: 0x2554d6190, 4096: No space left on device

      db_load: dbenv->close: No space left on device

      2019-12-03 06:30:46: Failed to fix databases

        • 1. Re: Data is full. Tried performing compaction but ended with an error
          Duncan Grisby

          You need to free up some space on the filesystem that stores the data. You need some headroom first to perform a transaction recovery and then to write temporary files during the compaction. The compaction cannot work if there is no space at all.
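A quick way to see how much headroom the data filesystem currently has is `df`. The sketch below wraps it in a small helper. The function name `free_kb` is made up for illustration, and the path in the usage comment is the mount point that appears later in this thread, so substitute your own.

```shell
# free_kb PATH: print the free space, in 1K blocks, of the filesystem holding PATH.
# -P forces the POSIX output format (one line per filesystem), so the
# fourth column of the second line is always the "Available" figure.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Example (the path is an assumption, use your datastore mount point):
#   free_kb /mnt/addm/db_data
```

If this prints 0 or a very small number, neither the transaction recovery nor the compaction has room to write anything.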

          1 of 1 people found this helpful
          • 2. Re: Data is full. Tried performing compaction but ended with an error
            Lisa Keeler

            Hitesh, you are performing what we refer to as a "reverse compact", with the --smallest-first option.

             

            See this KA 000156775:

            BMC Discovery: What are the best practices for running a compact (via tw_ds_offline_compact) on the Discovery datastore files?

             

            And notice especially the section at the bottom:

             

            "

            What about Reverse Compact?

             

            The tw_ds_offline_compact utility compacts datastore files starting with the largest file and working down to the smallest file. If there is not enough free space in the datastore partition to compact the largest file, tw_ds_offline_compact will realize this and end quickly. In this scenario, it is possible to perform a "reverse compact" (starting with the smallest file and working up) by running tw_ds_offline_compact with the --smallest-first argument. This option is available in Discovery 11.1 and later versions.

             

            However, a reverse compact may not finish successfully, if the space regained by compacting smaller files is not sufficient to process the larger files as the compact progresses.

             

            If the reverse compact does not finish successfully, the services will not start, and customer must add a larger disk, move the datastore to the larger disk, and then finish the compact with tw_ds_offline_compact --fix-interrupted.  After that is done, the services will start.

            "

             

            So, at this point, you must add a larger disk (with enough space for the datastore plus a lot of growth), move the datastore to the larger disk (using the tw_disk_utils command, because I assume your UI is down), and then finish the compact with tw_ds_offline_compact --fix-interrupted. After that is done, the services will start.

             

            There is no other choice unless your appliance already has another disk with enough free space to hold the datastore, plus room for compaction and room for growth. I have never seen that on anyone's appliance (that is, a really large, mostly unused disk that wasn't already being used by the datastore).

             

            I would at least double the space on the new disk ... i.e. make it at least twice the size of the current, uncompacted datastore.
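As a rough illustration of that sizing rule, the sketch below measures a directory with `du` and prints twice its size. The helper name `suggested_disk_kb` is hypothetical, and the factor of two is simply the rule of thumb stated above, not a value taken from product documentation.

```shell
# suggested_disk_kb DIR: print twice the current on-disk size of DIR, in 1K blocks.
suggested_disk_kb() {
    used_kb=$(du -sk "$1" | awk '{ print $1 }')
    echo $((used_kb * 2))
}

# Example (the path is an assumption, use your datastore directory):
#   suggested_disk_kb /mnt/addm/db_data
```

Treat the result as a lower bound for the new disk, since it covers the uncompacted datastore and an equal amount of working space but no long-term growth beyond that.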

             

            For help with moving the datastore to the larger disk using tw_disk_utils, you should open a Support ticket.

             

            Your system will be completely down until all of this is done, unless your datastore happens to be in the /usr partition (which would be odd, seeing as Broadridge is a large site) and you manage to free up enough space in /usr to finish the compaction.

            (To do that, see my blog: ADDM out of disk space in /usr)

            If that is the case, move the datastore to a larger disk immediately after the compaction finishes!

             

             

             

             

            2 of 2 people found this helpful
            • 3. Re: Data is full. Tried performing compaction but ended with an error
              Lisa Keeler

              Hitesh, in case you are trying to free up more space, be sure not to delete any datastore transaction log files.

              They are found here:

              /usr/tideway/var/tideway.db/log

               

              And, the files look something like this:

              log.0000000624

               

              So, they do not have a .log extension. Instead, they start with "log.nnnnnnn".

               

              Don't delete them.  If you do, then your datastore will not be recoverable.

              1 of 1 people found this helpful
              • 4. Re: Data is full. Tried performing compaction but ended with an error
                Hitesh Jha

                Hello Duncan,

                Do you mean I need to add a separate raw disk to our appliance at the VM level?

                However, I checked, and this consolidator is a physical Dell PowerEdge R730 server.

                • 5. Re: Data is full. Tried performing compaction but ended with an error
                  Lisa Keeler

                  Hitesh,

                  See the answer I provided (shortly after Duncan's). Yes, you have to add a new, larger disk to contain the datastore. The details and the KA are in my answer.

                  • 6. Re: Data is full. Tried performing compaction but ended with an error
                    Duncan Grisby

                    Well, you could add a new disk, but maybe there are things on the data disk that can be deleted / moved to make space. As Lisa says, if the transaction logs named log.nnnn are on the disk that is full, under absolutely no circumstances should you delete them. Other files though -- backups, debug logs (named *.log rather than log.*), etc. can be safely moved elsewhere.
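Because the two naming patterns are easy to confuse, here is a small sketch that tells them apart before anything is moved. Both helper names are hypothetical. `list_debug_logs` only prints candidates and `is_transaction_log` only tests a name, so neither touches any file.

```shell
# list_debug_logs DIR: print regular files under DIR whose names end in ".log".
# Transaction logs such as log.0000000624 start with "log." and have no
# ".log" suffix, so this pattern does not match them.
list_debug_logs() {
    find "$1" -type f -name '*.log'
}

# is_transaction_log NAME: succeed if NAME looks like "log." followed by digits only.
is_transaction_log() {
    case "$(basename "$1")" in
        log.*[!0-9]*) return 1 ;;   # a non-digit appears after "log."
        log.?*)       return 0 ;;   # "log." followed by one or more digits
        *)            return 1 ;;
    esac
}

# Example (the path is an assumption):
#   list_debug_logs /mnt/addm/db_data
#   is_transaction_log log.0000000624 && echo "transaction log: leave it alone"
```

Review the list by hand before moving anything. A name match alone does not prove that a file is safe to relocate.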

                    1 of 1 people found this helpful
                    • 7. Re: Data is full. Tried performing compaction but ended with an error
                      Hitesh Jha

                      Hello Lisa,

                       

                      Thanks for your response.

                      So basically I need to request an increase in the disk size from 1.1 TB to 2 TB and then start the compaction. Correct?

                      • 8. Re: Data is full. Tried performing compaction but ended with an error
                        Lisa Keeler

                        Hi Hitesh,

                         

                        If your physical hardware allows you to extend the disk size safely, then I guess that works. (Duncan?)

                         

                        In Support, we only tell customers to add a new larger disk, and then move the datastore to the new disk using our provided utilities.

                         

                        Assuming everything in /mnt/addm/db_data is datastore data, there is nothing you can delete.

                        I would take a look in that directory to make sure there isn't something else in there.

                         

                        Lisa

                        • 9. Re: Data is full. Tried performing compaction but ended with an error
                          Hitesh Jha

                          This is a physical server, not a virtual one where we can simply add more space.

                          Can you please suggest whether we can delete some files instead of compacting them?

                          • 10. Re: Data is full. Tried performing compaction but ended with an error
                            Duncan Grisby

                            As Lisa says, first find out what is using the space on /mnt/addm/db_data:

                             

                            du -h /mnt/addm/db_data

                             

                            If there is anything substantial other than database files or transaction logs, they can probably be moved / removed.

                             

                            Is the storage a physical disk, or SAN storage?  If it's SAN, you may be able to increase the volume size and then resize the filesystem.

                             

                            If you cannot easily extend the filesystem, you might be able to get yourself out of the situation by picking a large database file that fits comfortably in the 199GB you have in /usr. Move that file to /usr somewhere and replace it in the data/datadir directory it came from with a symbolic link to the new location of the file. That will hopefully mean that there is enough space for the compaction to run, thus freeing more space. The compaction writes new copies of all the files, so after a successful compaction, you should find a new version of the file that you moved is back in the data/datadir directory.
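The move-and-link step described above can be sketched as follows. This is a minimal illustration, not a supported procedure: the function name `relocate_with_symlink` is made up, it assumes the datastore is not running (as is already the case for an offline compaction), and the paths in the usage comment are placeholders.

```shell
# relocate_with_symlink FILE DEST_DIR
# Move FILE into DEST_DIR and leave a symbolic link with the same name behind.
relocate_with_symlink() {
    src=$1
    dest_dir=$2
    name=$(basename "$src")

    # The link target must be an absolute path, otherwise it would be
    # resolved relative to the directory that holds the link.
    case "$dest_dir" in
        /*) ;;
        *)  return 1 ;;
    esac

    # Refuse to run on something that is already a link or is not a regular file.
    [ -f "$src" ] && [ ! -L "$src" ] || return 1
    # Refuse to overwrite an existing file at the destination.
    [ ! -e "$dest_dir/$name" ] || return 1

    mv "$src" "$dest_dir/$name" || return 1
    ln -s "$dest_dir/$name" "$src"
}

# Example (both paths are placeholders):
#   relocate_with_symlink /path/to/data/datadir/<large file> /usr/<some directory>
```

Moving a file between filesystems is a copy followed by a delete, so the destination needs room for the whole file before any space is released on the full disk.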

                            1 of 1 people found this helpful
                            • 11. Re: Data is full. Tried performing compaction but ended with an error
                              Hitesh Jha

                              Please see the details below.

                              [root@cltdvladmc01 ~]# du -h /mnt/addm/db_data

                              975G    /mnt/addm/db_data/data/datadir

                              975G    /mnt/addm/db_data/data

                              16K     /mnt/addm/db_data/lost+found

                              975G    /mnt/addm/db_data

                               

                              [root@cltdvladmc01 ~]# fdisk -l

                               

                               

                              Disk /dev/sda: 299.4 GB, 299439751168 bytes

                              255 heads, 63 sectors/track, 36404 cylinders

                              Units = cylinders of 16065 * 512 = 8225280 bytes

                              Sector size (logical/physical): 512 bytes / 512 bytes

                              I/O size (minimum/optimal): 512 bytes / 512 bytes

                              Disk identifier: 0x00041432

                               

                               

                                 Device Boot      Start         End      Blocks   Id  System

                              /dev/sda1   *           1          32      256000   83  Linux

                              Partition 1 does not end on cylinder boundary.

                              /dev/sda2              32        1077     8388608   82  Linux swap / Solaris

                              /dev/sda3            1077        1338     2097152   83  Linux

                              /dev/sda4            1338       36405   281678848    5  Extended

                              /dev/sda5            1338        1563     1810432   83  Linux

                              /dev/sda6            1564        1759     1572864   83  Linux

                              /dev/sda7            1759        1890     1048576   83  Linux

                              /dev/sda8            1890        2004      917504   83  Linux

                              /dev/sda9            2005        2119      917504   83  Linux

                              /dev/sda10           2119       36405   275405824   83  Linux

                               

                               

                              WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util fdisk doesn't support GPT. Use GNU Parted.

                               

                               

                               

                               

                              Disk /dev/sdb: 1197.8 GB, 1197759004672 bytes

                              255 heads, 63 sectors/track, 145619 cylinders

                              Units = cylinders of 16065 * 512 = 8225280 bytes

                              Sector size (logical/physical): 512 bytes / 512 bytes

                              I/O size (minimum/optimal): 512 bytes / 512 bytes

                              Disk identifier: 0x00000000

                               

                               

                                 Device Boot      Start         End      Blocks   Id  System

                              /dev/sdb1               1      145620  1169686527+  ee  GPT

                              • 12. Re: Data is full. Tried performing compaction but ended with an error
                                Duncan Grisby

                                OK, so all the space on that disk really is taken up with datastore files. There is nothing you can safely delete. Your options are therefore to provision more storage if that's possible, or to temporarily move a big file and symbolic link it back as I mentioned before.

                                1 of 1 people found this helpful
                                • 13. Re: Data is full. Tried performing compaction but ended with an error
                                  Hitesh Jha

                                  Hello Duncan,

                                   

                                  Thanks for your response. Just to summarize our discussion:

                                  I need to move the datastore files highlighted below to /usr, leave soft links in their place, and then start the compaction. Correct?

                                  [tideway@cltdvladmc01 datadir]$ du . -a |sort -n -r|head -n 5

                                  1021415832      .

                                  90993292        ./p0004_nDiscoveredCommandResult_pidx

                                  70681380        ./p0004_rList_state

                                  50101960        ./p0002_nHost_hist

                                  43280444        ./p0004_rInference_state

                                  [tideway@cltdvladmc01 datadir]$

                                  • 14. Re: Data is full. Tried performing compaction but ended with an error
                                    Duncan Grisby

                                    Yes, just moving that one big p0004_nDiscoveredCommandResult_pidx file is likely to be sufficient. Put it in /usr/tideway, then make a symbolic link back to it (with the same name of course!).
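Before moving the file, it may be worth confirming that it fits in the destination with some margin. The sketch below is plain arithmetic on sizes in 1K blocks. The helper name `fits_with_headroom` and the 20 percent margin are illustrative assumptions, and reading the "199GB" free in /usr as 199 GiB (208666624 KB) is also an assumption.

```shell
# fits_with_headroom FILE_KB FREE_KB HEADROOM_PCT
# Succeed if FILE_KB plus HEADROOM_PCT percent is no larger than FREE_KB.
fits_with_headroom() {
    needed_kb=$(( $1 * (100 + $3) / 100 ))
    [ "$needed_kb" -le "$2" ]
}

# Example with the figures from this thread: the file is 90993292 KB
# according to the du listing above, and /usr is taken to have 199 GiB free.
#   fits_with_headroom 90993292 208666624 20 && echo "fits"
```

In practice the two sizes would come from `du -k` on the file and `df -Pk` on the destination rather than being typed in by hand.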
