Efficiently Migrate Large Amounts of Raw Data Using rsync and dd

Introduction

In the daily work of an administrator or a PC user, transferring the contents of an entire hard disk is a recurring challenge. It comes up, for example, when creating a backup or when copying a storage medium byte by byte into the cloud. Standard hard disks now have a capacity of ten terabytes and more. Such a data volume is not easy to move over the bandwidth of a typical internet connection, and it can be tedious even in a local network. This article first briefly describes conventional ways of dealing with the problem. The limitations of these methods lead to the requirements and substeps that the program ddtransfer.sh is meant to cover. Finally, there is an outlook on the further development of the program. The script can be found at the end of this article.

Traditional options of data migration

Copying, transferring, and importing the data of a USB stick with a size of 16 GB, for instance, takes 67 minutes during normal working hours in Berlin using a 100 MB connection that is shared by more than 120 employees at the same time.

dd -> Copy -> dd

The following steps were required:

Creating the image

```
root@ddtransfer:~# time dd if=/dev/sdc bs=1G iflag=fullblock of=sdc.img
```

Duration: 12 minutes, 25 seconds

Transferring the image

```
root@ddtransfer:~# time rsync --compress --compress-level=9 sdc.img root@158.222.102.233:/mnt/1/.
```

Duration: 51 minutes, 10 seconds

Importing the data into a storage volume

```
root@las-transfer2:/mnt/1# time dd if=sdc.img of=/dev/vdd bs=1G iflag=fullblock
```

Duration: 3 minutes, 28 seconds

CloneZilla

I canceled a cross-check with CloneZilla because I would have had to activate password authentication, which I want to avoid. I do not want to undermine basic security measures: in my experience such workarounds are often forgotten and end in hacked instances, especially when simple ad hoc passwords are used.

bbcp

With Bar Bar Copy (bbcp), the data transmission can be accelerated considerably, but the data would have been transmitted unencrypted. In addition, the goal here is to transfer very large files: if images are created beforehand, a large amount of disk space must be available in advance (even if the files are compressed). The transfer script, in contrast, requires only 10 GB of free disk space, although more, e.g. 100 GB, is recommended.

Send via Mail

Another option offered to customers for transmitting large amounts of data (starting from approx. 10 TB) is to ship hard disks or a server to the provider. For more than 10 TB of data this alternative may still be the more practicable one, but with further technical development it is foreseeable that ddtransfer.sh becomes a feasible way for these data volumes as well.

Acronis

Furthermore, I tested the creation of an online backup with Acronis, using a Strato online backup account. Pros: deduplication and compression of the backup. Cons: the client must be installed separately; raw devices cannot be imported directly but need a boot CD on the target machine; and an account is required. Conclusion: the speed of this method is quite good but, as said, an account is necessary.

Requirements for a new program

With an image size of 15 GB, the procedure described above – dd to file, copying over the network, dd to a device – is completely normal and appropriate. The actual problems only appear in case of

  1. a considerably larger amount of data
  2. which is to be transferred over a small bandwidth
  3. which additionally raises the question of data integrity

The duration of the transfer can easily take several days. If a failure occurs during this time, the transmission process must be restarted. It is obvious that the less data needs to be retransmitted, the less time is lost.

Once a partial step has been transferred, checksums created at the very beginning and again after importing the files show whether a bit flip or another change has occurred in the data. In the procedures described so far, this check can only be carried out on the image as a whole; a change is then detected only at the end of a very long process, which makes it all the more annoying.

The low bandwidth of a normal internet access is the part that costs most of the processing time. However, very few programs are able to take full advantage of even that bandwidth. An effective way to make more extensive use of the bandwidth is to open parallel connections. The individual connections are by no means processed at the same speed: it is not uncommon that, when 16 connections are started one after another a few seconds apart, the tenth transmission is completed first.
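The idea of several transfers running side by side can be sketched in a few lines of shell. This is only an illustration under assumptions, not the mechanism used in ddtransfer.sh: `send_block` is a hypothetical stand-in that sleeps instead of calling rsync, and `max_jobs` is an assumed limit.

```shell
# Sketch: run at most max_jobs transfers at the same time.
# send_block is a hypothetical placeholder for one rsync call.
log=$(mktemp)
max_jobs=4

send_block() {
    sleep 0.2                                # pretend to transfer block $1
    printf 'block %s sent\n' "$1" >> "$log"
}

running=0
for block in 1 2 3 4 5 6 7 8 9 10; do
    send_block "$block" &
    running=$((running + 1))
    if [ "$running" -ge "$max_jobs" ]; then
        wait                                 # wait until the current batch has finished
        running=0
    fi
done
wait                                         # wait for the last, possibly partial batch

sent=$(wc -l < "$log")
echo "blocks sent: $sent"
```

Waiting for a whole batch is the simplest portable form. The script at the end of this article polls the process list instead, so a free slot can be refilled as soon as a single transfer ends.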

The script ddtransfer.sh was written to close these gaps:

  1. The data to be transferred is divided into blocks that can be copied with multiple dd-invocations in parallel.
  2. A checksum is formed for each block during its creation.
  3. Depending on the bandwidth and number of hops up to the target computer, many parallel connections are used for sending the blocks.
  4. While further blocks are being transferred, the blocks that have already arrived on the target system are written to the target device, again several times in parallel if possible.
  5. After writing each transfer to the target device, a new checksum is formed from the finished block.
  6. At the end the checksums are compared. If there are differences, the affected block is transferred and calculated again.
  7. By logging the individual steps precisely, a transfer that has been started can be resumed after an interruption on the block at which the interruption occurred.
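Steps 1, 2, 5 and 6 can be tried out on a small scale. The following is a minimal sketch, assuming that regular temporary files stand in for the source and target devices; the values of `bs`, `count` and `parts` are example numbers, and `conv=notrunc` is only needed because the target here is a regular file.

```shell
# Sketch: split, checksum, write at the same offset, verify.
# Regular files stand in for the source and target devices (assumption).
work=$(mktemp -d)
src="$work/source.bin"
dst="$work/target.bin"
bs=4096        # dd block size in bytes (example value)
count=4        # dd blocks per part (example value)
parts=3        # number of parts (example value)

# Demo source: parts * count * bs random bytes
head -c $((parts * count * bs)) /dev/urandom > "$src"

mismatches=0
i=0
while [ "$i" -lt "$parts" ]; do
    offset=$((i * count))                    # position in units of bs
    part="$work/part_$i.img"

    # Read one part and compute its checksum while it is written to a file
    sum_in=$(dd if="$src" bs="$bs" skip="$offset" count="$count" 2>/dev/null | tee "$part" | sha256sum)

    # Write the part to the same offset; conv=notrunc keeps the rest of a regular file
    dd if="$part" of="$dst" bs="$bs" seek="$offset" count="$count" conv=notrunc 2>/dev/null

    # Read the written range back and compare the checksums
    sum_out=$(dd if="$dst" bs="$bs" skip="$offset" count="$count" 2>/dev/null | sha256sum)
    if [ "$sum_in" != "$sum_out" ]; then
        mismatches=$((mismatches + 1))
        echo "part $i differs and would have to be transferred again"
    fi
    i=$((i + 1))
done
echo "mismatches: $mismatches"
```

Because every part carries its own checksum, a mismatch only costs the retransmission of that one part rather than of the whole image.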

As far as I know, no available tool for transferring large devices currently meets all these conditions. The program ddtransfer.sh is based on ‘dd’, which belongs to the so-called core utilities, so it should run on any Linux that offers a shell; ‘rsync’ and an ssh agent must also be present. Thus it can be used with a common live Linux, for example to transfer Windows installations. It can also be used on small hardware such as the Raspberry Pi, which can serve as a transmission station.

The real sticking point of the whole procedure is the internet connection. Using the script in local networks does not necessarily make sense: with sufficiently large bandwidth, the script could even lengthen the total transfer time. However, if the amount of data is very large and the checksum calculation is important, the program should also be useful in the LAN. The original intention of the program can be summarized briefly:

  1. Local: Fast data import in a few processes
  2. Internet: Slow transfer in many processes
  3. Remote: Fast writing of data to the respective Device with few processes

Accordingly, there are three main functions in the script:

  1. CreateImageFiles: Create the parts of the image as individual files and create the input checksum.
  2. SendFiles: Transfer files with Rsync.
  3. ImageToTargetDevice: Import of the files and subsequent creation of the output checksum from the imported data.
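How stages that run at the same time can hand files to each other by renaming them is sketched below. This is a simplified illustration, not the script's code: `create_parts` and `send_parts` are hypothetical stand-ins for the first two functions, `cp` replaces rsync, the marker file `created.all` is an assumption, and the import stage is left out.

```shell
# Sketch: two background stages coordinated through file names.
work=$(mktemp -d)
mkdir "$work/local" "$work/remote"

create_parts() {                             # stand-in for CreateImageFiles
    for i in 1 2 3; do
        printf 'part %s\n' "$i" > "$work/local/part_$i.img"
        # Renaming publishes the finished file to the next stage
        mv "$work/local/part_$i.img" "$work/local/part_$i.img.transfer"
    done
    touch "$work/local/created.all"          # marker: nothing more will come
}

send_parts() {                               # stand-in for SendFiles
    while :; do
        for f in "$work"/local/*.img.transfer; do
            [ -e "$f" ] || continue          # glob matched nothing
            mv "$f" "$f.ongoing"
            cp "$f.ongoing" "$work/remote/$(basename "$f")"   # cp stands in for rsync
            rm "$f.ongoing"
        done
        # Stop once the producer is done and no published file is left
        if [ -e "$work/local/created.all" ] && ! ls "$work"/local/*.img.transfer > /dev/null 2>&1; then
            break
        fi
        sleep 0.1
    done
}

create_parts &
send_parts &
wait
ls "$work/remote"
```

A rename within one file system is atomic, so the sending stage never sees a half-written part under the `.img.transfer` name.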

The ShowProceeding function is used for monitoring and final processing, i.e. the checksums are finally compared and individual parts are again created, transferred and imported if necessary.

The other functions

  1. dd_command,
  2. RemoteWorkspace,
  3. Transfer,
  4. RemoteStatusImageToTargetDevice and
  5. ReImage

represent substeps and are invoked by the main functions.

Resume after interruption

When the target computer restarts, the script continues to run without interruption. If the source computer or the target computer fails for a longer period of time, the program can be executed again. In that case, the name of the report file, which contains the necessary variable values of the previous call, must be given as an option. Then the status of the data transfer and of the blocks created is checked. Where necessary, interrupted subprocesses are restarted and the entire process continues. The suffixes of the file names also show how far processing has already progressed:

  • ‘run’ for still to be processed,
  • ‘transfer’ for ready for transmission,
  • ‘ongoing’ for being in transmission and
  • ‘ToDev’ for a file that is being written to the remote device.
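As a small illustration, the state of a block can be read from its file name with a `case` statement. The function name `block_state` and the wording of the messages are assumptions; the `done` suffix is taken from the script listing at the end of this article.

```shell
# Sketch: map a block file name to its processing state.
block_state() {
    case "$1" in
        *.run)      echo "still to be processed" ;;
        *.ongoing)  echo "in transmission" ;;
        *.ToDev)    echo "being written to the target device" ;;
        *.transfer) echo "ready for transmission" ;;
        *.done)     echo "finished" ;;
        *)          echo "unknown" ;;
    esac
}

block_state "DeviceCopy_0001.img.transfer"
block_state "DeviceCopy_0001.img.transfer.ongoing"
```

Only the last suffix decides, so a name ending in `.transfer.ongoing` counts as being in transmission.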

The main means for the program continuation is the report file which is read and then updated. Formerly started transfers are initiated again. Continuing started Rsync calls is technically possible, but would have required rewriting some functions, which seemed to me to be too complicated under the given circumstances, because the development of the script was already too advanced.

Challenges

Disk space

There is always the possibility that the disk fills up to 100% locally or remotely, so enough temporary space is required to buffer the data blocks. The run is kept within limits by continuously checking the space still available and by calculating the space that the block files currently being processed will occupy. The block files themselves are not compressed, because compression is already passed as an option to the rsync calls.
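Such a check might look like the following sketch. It assumes GNU `df` (for `-B1`); the function names and all byte values are example assumptions, not the tuning of the script.

```shell
# Sketch: decide whether another block file may be created.
free_bytes() {
    # With -P, column 4 of the second line is the available space; -B1 reports bytes
    df -P -B1 "$1" | awk 'NR == 2 { print $4 }'
}

may_create_block() {
    dir=$1; block_size=$2; running_jobs=$3; reserve=$4; minimum=$5
    # Subtract what running jobs will still occupy and a reserve for the system
    usable=$(( $(free_bytes "$dir") - running_jobs * block_size - reserve ))
    [ "$usable" -gt "$minimum" ]
}

if may_create_block /tmp 1048576 2 1048576 1048576; then
    echo "enough space for another block"
else
    echo "waiting for space"
fi
```

Counting the blocks that are still being written matters because their files have not yet reached their final size when `df` is queried.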

For example, the transmission of one terabyte from my office took about 19 hours on weekdays and about 16 hours on weekends. The difference in available bandwidth was also noticeable from the early evening onward: at first the temporary disk space was fully used locally; later, with more free bandwidth, considerably more was needed on the remote station.

Bandwidth

If the program is called with many connections, the throughput for everyone else using the same network drops. But that also means, as just described, that transmitting at night or at the weekend significantly accelerates the data migration.

Mode of operation of ‘dd’

‘dd’ can only either read or write at any one time. For this reason, it is advantageous to process reasonably large blocks in one work step. At the same time, this program can address specific positions of a file or block device blockwise or bytewise. This makes it the only tool that is suitable for our purpose. Precise positioning is essential for splitting the data to be transmitted into clearly defined individual steps. The block and file size in ddtransfer.sh is always a multiple of the minimum block size of 512 bytes. Up to now I have only allowed one dd process per processor core (if there is only one core, two processes are allowed).
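The positioning can be seen with a tiny example. This is a sketch on a temporary file with 16 known bytes; the block sizes are chosen only to keep the result readable.

```shell
# Sketch: address a position blockwise and bytewise with dd.
f=$(mktemp)
printf 'AAAABBBBCCCCDDDD' > "$f"

# Blockwise: skip 2 blocks of 4 bytes, then read 1 block (bytes 8 to 11)
blockwise=$(dd if="$f" bs=4 skip=2 count=1 2>/dev/null)

# Bytewise: with bs=1, skip and count are byte offsets (bytes 5 to 7)
bytewise=$(dd if="$f" bs=1 skip=5 count=3 2>/dev/null)

echo "$blockwise $bytewise"    # prints "CCCC BBB"
```

Reading with `skip` and writing with `seek` at the same `bs` and offset is what lets each part land at the same position on the target device.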

The way of working mentioned at the beginning of this section makes it somewhat cumbersome to determine whether ‘dd’ will still write to a file or whether the respective process has already been completed. While ‘dd’ still reads from the device (the larger the selected read block size, the longer the read process), the target file does not change and is also not considered open, so it cannot be checked with ‘lsof’. Instead, the expected size of the file has to be calculated in advance and then checked continuously. During the development of the script it has therefore proven useful to completely restart image block write processes after an interruption. The ‘status=progress’ option is offered in newer implementations of ‘dd’, but it is not always available and I have not yet had the time to check whether it can be used. Furthermore, the time lost when a ‘dd’ call is restarted is compensated by the transfer time of the files.
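Checking a file against a size calculated in advance can be sketched as follows. The function name `wait_for_size` and the timeout are assumptions added for the example, and `stat -c %s` assumes GNU stat.

```shell
# Sketch: poll a file until it has reached the expected size or a timeout expires.
wait_for_size() {
    file=$1; expected=$2; timeout=$3
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        size=$(stat -c %s "$file" 2>/dev/null || echo 0)
        if [ "$size" -ge "$expected" ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# Demo: a background job writes 2048 bytes after one second
target=$(mktemp)
( sleep 1; head -c 2048 /dev/zero > "$target" ) &

if wait_for_size "$target" 2048 10; then
    result="complete"
else
    result="timed out"
fi
wait
echo "$result"
```

The timeout keeps the loop from waiting forever if the writing process has died in the meantime.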

‘ssh’ and ‘rsync’

Frequent ssh calls are necessary to execute the remote commands and control the processes. It turned out that the number of possible connections can also be a scarce resource. In order to avoid bottlenecks, various calls were combined to make the script more efficient. For further development, it would make sense to optimize this even more. Two new ssh connections are opened for each start of an rsync process. If one of the instances involved reboots, the rsync processes get stuck and interfere with subsequent calls. It remains to be seen which solution is most appropriate here. So far I have helped myself with restarting the VMs involved.
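One option that might reduce the number of new connections is OpenSSH connection multiplexing, where later ssh calls reuse one master connection. The following is only a sketch and has not been tested with ddtransfer.sh; the function name `remote`, the socket path and the 60 second persistence are assumptions. `SSH_BIN` is set to `echo` here so that the demo prints the command line instead of connecting.

```shell
# Sketch: wrap ssh calls so that they share one master connection.
SSH_BIN=echo                                 # dry run; set SSH_BIN=ssh for real use

remote() {
    host=$1; shift
    "$SSH_BIN" -o ControlMaster=auto \
               -o ControlPath="$HOME/.ssh/cm-%r@%h:%p" \
               -o ControlPersist=60 \
               "root@$host" "$@"
}

# 192.0.2.10 is a placeholder address
cmdline=$(remote 192.0.2.10 "ls /mnt")
echo "$cmdline"
```

Whether this also helps with rsync processes that hang after a reboot would still have to be examined.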

Invoking the command

The script should be called directly from a root shell. An ssh agent is also required to load a private ssh key. The login must also be possible remotely as ‘root’. This could be an example:

./ddtransfer.sh --local_device /dev/sdc --remote_device /dev/vdd --TargetHost 46.16.76.151 --remote_transfer_dir /mnt --keep_logs

In this case:

  • ‘/dev/sdc’ is the local device to transmit
  • ‘/dev/vdd’ is the storage volume of the addressed virtual machine that the data will be written to
  • ‘46.16.76.151’ is the IP address of this VM
  • ‘/mnt’ is the remote directory used temporarily for the image files. Such a directory can be set locally as well (default is $HOME)
  • ‘--keep_logs’ keeps the logs after completion of the run
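Before such a call it can be useful to check that the required commands exist on the machine. A minimal sketch, in which the function name `missing_tools` and the selection of commands are assumptions based on the tools named in this article:

```shell
# Sketch: report which of the required commands are missing.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" > /dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools dd rsync ssh ssh-add sha256sum)
if [ -n "$missing" ]; then
    echo "missing: $missing"
else
    echo "all required commands found"
fi
```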

The command

./ddtransfer.sh --restart report_ddt_15337519335231.log

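restarts a formerly invoked run. The report file is created at the first call of the transfer; its name consists of ‘report_ddt_’, the start time in epoch seconds ($(date +%s)) and the process ID of the script, followed by ‘.log’.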
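The report file contains lines of the form NAME=value, which a restart reads back. The following sketch shows one possible way to restore such values; it is not the code used in the script, the function name `load_report` is an assumption, and the demo report with the placeholder address 192.0.2.10 is made up for the example.

```shell
# Sketch: restore NAME=value settings from a report file.
load_report() {
    while IFS='=' read -r key value; do
        case $key in
            ''|*[!A-Za-z_]*) continue ;;     # skip lines that are not plain assignments
        esac
        value=${value#\"}; value=${value%\"} # drop surrounding double quotes
        eval "$key=\$value"                  # safe because key was validated above
    done < "$1"
}

report=$(mktemp)
cat > "$report" <<'EOF'
Invoked command is
./ddtransfer.sh --local_device /dev/sdc --remote_device /dev/vdd --TargetHost 192.0.2.10
JobTime=1533751933
TargetHost=192.0.2.10
SSH_command="ssh -p 22 root@192.0.2.10"
EOF

load_report "$report"
echo "$TargetHost $JobTime"
```

Validating the name before the assignment means that free-text lines in the report, such as the logged command line, are ignored.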

Completing the transfer for data migration with ddtransfer.sh

After transferring the image files, the report file is processed and checked to see if the start and end checksums do not match or are even missing. The absence of checksums calculated after writing each block to the target device may occur if the connection between the source and target host has been interrupted or the temporarily used disk space filled up. If the checksum is missing or divergent, it is recalculated. If it still differs, the block is completely read out, transferred and recalculated. The dd calls are logged completely so that they can be repeated easily. If the checksum is still not correct, a corresponding error message will be shown.
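The comparison can be sketched with awk on lines of the form "Checksum step N local is …" and "Checksum step N remote is …", as they are written to the report. The function name `steps_to_repeat` and the shortened checksums in the demo are assumptions for the example.

```shell
# Sketch: list the steps whose local and remote checksums differ or are missing.
steps_to_repeat() {
    awk '
        $1 == "Checksum" && $2 == "step" && $5 == "is" {
            sum[$4 "/" $3] = $6              # keep the latest value per side and step
            seen[$3] = 1
        }
        END {
            for (s in seen) {
                l = sum["local/" s] ""       # "" forces a string comparison
                r = sum["remote/" s] ""
                if (r == "" || l != r) print s
            }
        }
    ' "$1" | sort
}

report=$(mktemp)
cat > "$report" <<'EOF'
Checksum step 1 local is aaa
Checksum step 1 remote is aaa
Checksum step 2 local is bbb
Checksum step 2 remote is ccc
Checksum step 3 local is ddd
EOF

steps_to_repeat "$report"    # prints 2 and 3, one per line
```

Step 2 is listed because the checksums differ, step 3 because the remote checksum is missing.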

In the event of error messages, the log files are retained. After checking the situation, the individual steps can be restarted manually if required.

Perspective and outlook

For future development, it must be further investigated how the individual program steps relate to each other and to the other parts with regard to multiple calls. This would hopefully improve the balancing of data migration and thus speed up the overall processing.

Besides, improvements could be

  • no transmission of blocks with the same checksum
  • a better status display
  • the dd option ‘conv=noerror,sync’
  • complete clearing of hanging rsync and ssh processes in case of a new call of ddtransfer.sh without restarting source and target computers
  • continue formerly started rsync jobs
  • the extension of the status file, among other things to improve the resumption of the transfer and
  • shorter file names in the temporary directories.

The Program Code

#!/bin/bash

# Georg Schieche-Dirik
# Script to transfer raw images using the dd and rsync commands
# via network.

# This script is work in progress. And as in general: use at your own risk!
# License GPL v. 3

if [ $# -le "1" ] ; then
cat <<-ENDOFMESSAGE
    Usage: $0 -l|--local_device 'source_device' -r|--remote_device 'target_device' -H|--TargetHost 'target_host'
    (-R|--remote_transfer_dir 'remote_transfer_directory' (default is '/homedir'))
    (-T|--local_transfer_dir 'transfer_directory' (default is '/homedir'))
    (-p|--ssh_port 'ssh_port' (default is '22')) (-k|--keep_logs keep log files in transfer directory)
    (-t|--RsyncTimeout 'rsync_timeout' (default is '60' seconds))
    (-m|--max-connections (max connections will be set to 16, default is 8)) 
    (-S|--restart 'formerly written report file' (any other given option will be overwritten))

    For rsync a running ssh-agent is necessary.
    Tested with ext4, ntfs, xfs, btrfs.
    It is recommended not to start any other dd execution on the local or target host while this program is running.

    For sudo users like Ubuntu, do:
    1) 'sudo su -'
    2) 'eval \`ssh-agent -s\`'
    3) 'ssh-add /home/user/.ssh/id_rsa'
    Now you can start the dd-transfer script.

    Example:

    $0 -l /dev/vdd -r /dev/vdd -H 46.16.76.151 -R /mnt -k
    $0 --local_device /dev/vdd --remote_device /dev/vdd --TargetHost 46.16.76.151 --remote_transfer_dir /mnt --keep_logs

    This transfers the contents from the local storage volume /dev/vdd to the remote volume /dev/vdd on host 46.16.76.151.
    There in the directory /mnt the temporary image files are stored before they are sent to the device. The log files are kept.
ENDOFMESSAGE
exit
fi

MaxSSHConnections=100
KeepLogs=no
SSHPort=22 
ProcessNumber=8 # Number of transfer processes in parallel, change only if you know what you are doing!
TransferDir=${HOME}
RemoteTransferDir=${HOME}
AvailableSpaceMin=5275810816 # 5G
MinOperatingSystemSpace=$((${AvailableSpaceMin} / 5))
SectorSize=512 
RsyncTimeout=60
JobID=$$
UsedOptions=$@
Cores=$(grep -c processor /proc/cpuinfo) ; if [ ${Cores} -eq 1 ] ; then Cores=2 ; fi

if ! ssh-add -l 2>&1 ; then
    echo "A running SSH agent is necessary!"
    exit 2
fi

while test $# -gt 0 ; do
    case "$1" in
        -h|--help)
            $0 ; exit 0 ;; # print the usage text, then stop instead of looping on the same option
        -S|--restart) shift; 
            GivenReportFile=$1
            shift ;;
        -l|--local_device) shift; 
            SourceDevice=$1
            shift ;;
        -r|--remote_device) shift;
            TargetDevice=$1
            shift ;;
        -m|--max-connections)
            ProcessNumber=16
            shift ;;
        -p|--ssh_port) shift;
            SSHPort=$1
            shift ;;
        -T|--local_transfer_dir) shift;
            TransferDir=$1
            shift ;;
        -R|--remote_transfer_dir) shift;
            RemoteTransferDir=$1
            shift ;;
        -H|--TargetHost) shift;
            TargetHost=$1
            shift ;;
        -t|--RsyncTimeout) shift;
            RsyncTimeout=$1
            shift ;;
        -k|--keep_logs) 
            KeepLogs=yes
            shift ;;
        *)  echo "ERROR: Missing correct option, try $0 to get help"
            exit 2 ;;
    esac
done

if [ ! ${GivenReportFile} ] ; then

    JobTime=$(date +%s)
    JobTimeFile=$(date --date="@$JobTime" +%Y-%m-%d-%H-%M-%S)
    Report=$(pwd)/report_ddt_${JobTime}${JobID}.log
    LogDir=${TransferDir}/DiskTransfer_${JobTime}_${JobID} 
    RemoteLogDir=${RemoteTransferDir}/DiskTransfer_${JobTime}_${JobID} 
    SSH_command="ssh -p $SSHPort root@${TargetHost}"
    SCP_command="scp -P $SSHPort"

    (
        echo "Invoked command is"
        echo "$0 ${UsedOptions}"
        echo "JobID=${JobID}"
        echo "JobTime=${JobTime}"
        echo "SourceDevice=${SourceDevice}"
        echo "TargetDevice=${TargetDevice}"
        echo "SSHPort=${SSHPort}"
        echo "TransferDir=${TransferDir}"
        echo "RemoteTransferDir=${RemoteTransferDir}"
        echo "TargetHost=${TargetHost}"
        echo "RsyncTimeout=${RsyncTimeout}"
        echo "KeepLogs=${KeepLogs}"
        echo "LogDir=${LogDir}"
        echo "RemoteLogDir=${RemoteLogDir}"
        echo "ProcessNumber=${ProcessNumber}"
        echo "SSH_command=\""${SSH_command}"\""
        echo "SCP_command=\""${SCP_command}"\""
        echo
    ) | tee ${Report}

else

    Report=${GivenReportFile}
    for FormerJobVariable in $(grep -P '^[A-Za-z_]*=' ${Report}) ; do
        eval $(grep -o -P -m 1 "^${FormerJobVariable}.*" ${Report})
    done

    JobTimeFile=$(date --date="@$JobTime" +%Y-%m-%d-%H-%M-%S)

    if [[ $(ps cax | grep rsync 2> /dev/null) ]] ; then (
        echo
        echo "ERROR: rsync processes are still running!"
        echo "They might be related to a former execution of $0."
        echo "Please wait until they are finished or stop them."
        echo ) | tee -a ${Report}
        exit 2
    elif $SSH_command "ps ax | grep '[d]d count=' 2> /dev/null" ; then (
        echo
        echo "ERROR: dd processes are still running on remote host!"
        echo "They might be related to a former execution of $0."
        echo "Please wait until they are finished or stop them."
        echo ) | tee -a ${Report}
        exit 2
    fi

    (   echo "Removing partly processed local files of former command run:"
        rm -v $LogDir/*img
        for i in $(ls $LogDir/*ongoing) ; do mv -v ${i} ${i%.ongoing} ; done
        echo

        echo "Removing partly processed remote files of former command run:"
        echo
        $SSH_command "find ${RemoteLogDir} -name \"*.ToDev\" | while read ; do mv -v \${REPLY} \${REPLY%.ToDev} ; done"
        echo
        $SSH_command "find ${RemoteLogDir} -name \"*.transfer.*\"  | while read ; do rm -v \${REPLY} ; done"
        echo

    ) | tee -a ${Report}

fi

if ! fdisk -l ${SourceDevice} > /dev/null 2>&1 ; then
    echo
    echo "ERROR: Read and write access to device ${SourceDevice} is crucial!"
    exit 2
elif ! ${SSH_command} "fdisk -l ${TargetDevice} > /dev/null 2>&1" ; then
    echo
    echo "ERROR: Read and write access to remote device ${TargetDevice} is crucial!"
    exit 2
fi

Blocks=$(cat /sys/block/${SourceDevice##*/}/device/block/${SourceDevice##*/}/size 2>/dev/null)
if [[ "${Blocks}" == "" ]] ; then
    SourceDeviceRaw=${SourceDevice##*/} ; SourceDeviceRaw=${SourceDeviceRaw//[0-9]/}
    SourceDeviceNumber=${SourceDevice##*/}; SourceDeviceNumber=${SourceDeviceNumber//[a-z]/}
    Blocks=$(cat /sys/block/${SourceDeviceRaw}/device/block/${SourceDeviceRaw}/${SourceDeviceRaw}${SourceDeviceNumber}/size 2>/dev/null)
fi
if [[ "${Blocks}" == "" ]] ; then
    LVM=$(ls -l ${SourceDevice} | grep -P -o 'dm-.*')
    Blocks=$(cat /sys/block/${LVM}/size 2>/dev/null)
fi
if [[ "${Blocks}" == "" ]] ; then
    echo "Number of device blocks for ${SourceDevice} could not be found! Exiting..."
    exit 2
fi

Rsync_command='rsync -e "'"ssh -p ${SSHPort}"'" --timeout='${RsyncTimeout}' --compress --compress-level=9'

StartBlock=1
Skip=0
Run=0
SectorPortion=$(( ${Blocks} / $(( ${ProcessNumber} * ${ProcessNumber} * 8 )) ))
while [[ ${SectorPortion} -gt $((${AvailableSpaceMin} / ${ProcessNumber})) ]] ; do 
    Run=$((${Run}+1))
    SectorPortion=$(( ${Blocks} / $((${ProcessNumber} * ${Run})) ))
done
BlockCount=${Cores}
Chunk=$((${SectorPortion} * ${SectorSize} / ${Cores}))
FileSize=$((${SectorPortion} * ${SectorSize}))
LastRun=$(( ${Blocks} / $SectorPortion ))
Iterations=( $(seq -w ${LastRun}) )

mkdir -p ${LogDir}

if [ ! ${GivenReportFile} ] ; then
    for i in $(echo ${Iterations[*]}) ; do 
        touch ${LogDir}/${JobTimeFile}_${i}.run
    done
elif [ ${GivenReportFile} ] ; then
    RemoteFilesInProcess=( $($SSH_command "ls ${RemoteLogDir}/*_${JobTimeFile}.* | grep -P -o 'DeviceCopy_[0-9]*_' | grep -P -o '[0-9]*'") )
    LocalFilesToProcess=( $(ls ${LogDir}/${JobTimeFile}*.run | grep -P -o '[0-9]{4}.run' | grep -P -o '[0-9]*') )
    for MissingDoneOrRun in $(echo ${Iterations[@]} ${LocalFilesToProcess[@]} ${RemoteFilesInProcess[@]} | tr ' ' '\n' | sort | uniq -u) ; do
        touch ${LogDir}/${JobTimeFile}_${MissingDoneOrRun}.run
    done
fi

GetWorkSpace="df -P -B1 $LogDir | grep /dev/ | tr -s ' ' | cut -d ' ' -f4"
GetRemoteWorkSpace="df -P -B1 $RemoteLogDir | grep /dev/ | tr -s ' ' | cut -d ' ' -f4"

function dd_command() {
    Direction=$1
    Device=$2
    File=$3

    if [ "$Direction" == "From" ] ; then 
        Input=$Device
        Output=$File
    elif [ "$Direction" == "To" ] ; then 
        Input=$File
        Output=$Device
    fi

    Param=( $(echo ${File//_/ }) )
    Step=${Param[3]}
    Chunk=${Param[4]}
    Count=${Param[5]}
    Skip=${Param[6]}
    Groth=0

    if [[ "${Output}" =~ ".img" ]] ; then

        echo "Copying into ${File}"
        echo "dd count=${Count} if=${Input} bs=${Chunk} skip=$Skip | tee ${Output} | sha256sum"
        echo

        CheckSum=$(dd count=${Count} if=${Input} bs=${Chunk} skip=$Skip | tee ${Output} | sha256sum)

        while [ ${Groth} -lt ${FileSize} ] ; do
            sleep 2
            Groth=$(ls -l ${File} | cut -f5 -d ' ')
        done

        echo "Checksum step ${Step} local is ${CheckSum% -}" | tee -a ${Report}
        mv ${TransferFile}.img ${TransferFile}.img.transfer

    elif [[ "${Output}" =~ "/dev/" ]] ; then

        echo "Writing to device ${File}"
        echo "$SSH_command \"dd count=${Count} if=${Input} of=${Output} bs=${Chunk} seek=${Skip}\""
        echo "and"
        echo "$SSH_command \"dd count=${Count} if=${Output} bs=${Chunk} skip=${Skip} | sha256sum\""
        echo

        $SSH_command "dd count=${Count} if=${Input} of=${Output} bs=${Chunk} seek=${Skip} 2>&1" 
        sleep 2

        CheckSum=$($SSH_command "dd count=${Count} if=${Output} bs=${Chunk} skip=${Skip} | sha256sum") 
        sleep 2

        echo "Checksum step ${Step} remote is ${CheckSum% -}" | tee -a ${Report}
        Done=${Input%.img.transfer.ToDev}.done

        StepCount=0
        while ! $SSH_command "mv -v ${Input} ${Done} ; echo 0 > ${Done}" && [[ ${StepCount} -lt 11 ]]; do
            echo "Attempt to mv ${Input} to be done failed! Retrying..."
            sleep 10
            StepCount=$((${StepCount}+1))
        done
        echo "${Input} done"

    fi
}

function RemoteWorkspace {
    CopyJobsAndSpace=( $($SSH_command "
        ls -a ${RemoteLogDir}/.DeviceCopy_* ${RemoteLogDir}/DeviceCopy_* 2> /dev/null | grep -v done | wc -l
        eval $GetRemoteWorkSpace
    ") )

    CopyJobs=${CopyJobsAndSpace[0]}
    WorkSpace=${CopyJobsAndSpace[1]}
    SizeOfRunningCopyJobs=$((${CopyJobs} * ${Chunk} * ${BlockCount}))        
    echo $((${WorkSpace} - ${SizeOfRunningCopyJobs} - ${MinOperatingSystemSpace}))
}

function CreateImageFiles {
    Run=0

    while [[ ${Iterations[$Run]} ]] ; do

        if [ ! -e ${LogDir}/${JobTimeFile}_${Iterations[${Run}]}.run ] ; then

            Run=$((${Run} + 1))
            Skip=$((${Skip} + ${BlockCount}))

        else

            DDJobs=$(ls ${LogDir}/DeviceCopy_*.img 2> /dev/null | wc -l) || DDJobs=0 
            SizeOfRunningDDJobs=$((${DDJobs} * ${Chunk} * ${BlockCount}))        
            WorkSpace=$(eval $GetWorkSpace)
            WorkSpace=$((${WorkSpace} - ${SizeOfRunningDDJobs} - ${MinOperatingSystemSpace}))

            if [ $WorkSpace -gt $AvailableSpaceMin ] && [ ${DDJobs} -lt ${Cores} ] ; then

                TransferFile=${LogDir}/DeviceCopy_${Iterations[$Run]}_${Chunk}_${BlockCount}_${Skip}_${JobTimeFile}
                (dd_command From "${SourceDevice}" ${TransferFile}.img 2>&1 >> ${LogDir}/CreateImageFiles.log &)
                while [ ! -e ${TransferFile}.img ] ; do
                    sleep 2
                done
                Skip=$((${Skip} + ${BlockCount}))
                StartBlock=$((${StartBlock}+${SectorPortion})) # ????
                Run=$((${Run} + 1))

            else

                sleep 5

            fi
        fi

    done
}

function Transfer() {
    File=$1
    mv -v ${File} ${File}.ongoing

    Image=${File##*/}
    echo Transfer start at $(date)
    TransferCommand="$Rsync_command ${File}.ongoing ${TargetHost}:${RemoteLogDir}/${Image}"
    TransferCommandExitStatus=1
    TransferCommandLeave=0
    until [ $TransferCommandExitStatus -eq 0 ] || [ $TransferCommandLeave -eq 10 ] ; do 
        echo "Transferring ${File}"
        echo ${TransferCommand}
        eval $TransferCommand
        TransferCommandExitStatus=$?
        TransferCommandLeave=$((${TransferCommandLeave} + 1))
        sleep 3
    done
    echo "Rsync command exit status: ${TransferCommandExitStatus}"
    echo Transfer end at $(date)

    rm ${File}.ongoing
}

function SendFiles {
    sleep 5
    $SSH_command "[ -e ${RemoteLogDir} ] || mkdir ${RemoteLogDir}"

    while (ls ${LogDir}/${JobTimeFile}_*.run 2> /dev/null) || (ps cax | grep "dd count") || (ls ${LogDir}/*_${JobTimeFile}.img.transfer 2> /dev/null); do
        for TransferFile in $(ls ${LogDir}/*_${JobTimeFile}.img.transfer 2> /dev/null | head -n 1) ; do
            while [ "$(ps ax | grep -c 'rsync -e')" -gt "${ProcessNumber}" ] ; do
                sleep 5
            done
            RunState=${TransferFile#*Copy_} ; RunState=${RunState%%_*}

            # Use the configured SSH port instead of a fixed 22
            while [[ $(/bin/netstat -ntp | grep -v TIME_WAIT | grep -c ${TargetHost}:${SSHPort}) -gt ${MaxSSHConnections} ]] ; do 
                echo -n "Too many SSH connections!" ; /bin/netstat -ntp | grep -c "${TargetHost}:${SSHPort}"
                sleep 10
            done

            # Wait for remote space and for a free slot; grep -c yields a number for the comparison
            while [ $(RemoteWorkspace) -lt $AvailableSpaceMin ] \
                || [[ $(ps ax | grep -c ssh) -gt ${MaxSSHConnections} ]] 
            do
                sleep 7
            done

            (Transfer $TransferFile 2>&1 >> ${LogDir}/SendFiles.log &)

            TMP=${LogDir}/${JobTimeFile}_${RunState}.run ; if [ -e ${TMP} ] ; then rm -v ${TMP} ; fi
            #sleep 1
        done
        sleep 3
    done
}

function RemoteStatusImageToTargetDevice {

    export RemoteStatus=${RemoteLogDir}/RemoteStatusImageToTargetDevice.sh
    TmpStatusFile=${LogDir}/RemoteStatusImageToTargetDevice.sh
    $SSH_command "[ -e ${RemoteLogDir} ] || mkdir ${RemoteLogDir}"

    echo "
        one=\$(ls ${RemoteLogDir}/*_${JobTimeFile}*img.transfer 2> /dev/null | head -n 1)
        two=\$(ps ax | grep -c 'dd count=')
        three=\$(ls ${RemoteLogDir}/*_${JobTimeFile}.done 2> /dev/null | wc -l)
        echo \$one \$two \$three
    " > ${TmpStatusFile}

    chmod a+x ${TmpStatusFile}
    $SCP_command ${TmpStatusFile} root@${TargetHost}:${RemoteStatus}
}

function ImageToTargetDevice {
    sleep 8 
    TargetCoreNumber=$($SSH_command "grep -c processor /proc/cpuinfo")

    while ! grep -P "Checksum step 0*1 local" ${Report} > /dev/null ; do
        sleep 5
    done

    RemoteStatusImageToTargetDevice

    while ls ${LogDir}/*_${JobTimeFile}.img* 2> /dev/null || $SSH_command "ls ${RemoteLogDir}/*_${JobTimeFile}.img.transfer 2> /dev/null" ; do

        while
            FileAndProcessNumber=( $($SSH_command "bash ${RemoteStatus}") ) && \
                    [[ "${FileAndProcessNumber[1]}" != "${LastRun}" ]] && \
                [[ $(ps ax | grep -c ssh) -lt ${MaxSSHConnections} ]]
        do 
            echo  FileAndProcessNumber ${FileAndProcessNumber[*]} 
            if [[ ${FileAndProcessNumber[0]} =~ "img.transfer" ]] && \
               [[ ${FileAndProcessNumber[1]} -lt $((${TargetCoreNumber}+1)) ]] && \
               [[ $($SSH_command "ls ${RemoteLogDir}/*.ToDev 2>/dev/null | wc -l") -lt $((${TargetCoreNumber}+1)) ]] ;
            then
                ImagePart=${FileAndProcessNumber[0]}.ToDev
                $SSH_command "mv -v ${FileAndProcessNumber[0]} ${ImagePart}"

                (dd_command To "${TargetDevice}" ${ImagePart} 2>&1 >> ${LogDir}/ToDeviceImageFiles.log &)

            fi
            sleep 1 
        done
        sleep 1

    done
}

function ReImage {
    Part=$1
    LocalChecksum=$(tac ${Report} | grep -m 1 -P "Checksum step ${Part} local is [a-z0-9]{64}" | cut -d ' ' -f6)
    Jump=$(grep -m 1 -o -P "Copy_${Part}_.*ToDev$" ${LogDir}/ToDeviceImageFiles.log | cut -d '_' -f 5)
    Check=$(grep -P "^ssh -p ${SSHPort} .*bs=${Chunk} skip=${Jump} " ${LogDir}/ToDeviceImageFiles.log)
    RemoteChecksum=$(eval ${Check} 2>/dev/null)
    RemoteChecksum=$(echo ${RemoteChecksum} | sed 's/ \-//')

    if [[ "${LocalChecksum}" == "${RemoteChecksum}" ]] ; then
        echo "Checksum step ${Part} remote is ${RemoteChecksum% -}" | tee -a ${Report}
        echo "Checksum step ${Part} is ok on both sides." | tee -a ${Report}
    else
        echo "WARNING: Checksum step ${Part} differs. Transferring mentioned part of the volume again."
        GetImagingAgain=$(grep "skip=${Jump} " ${LogDir}/CreateImageFiles.log | grep -v '^+')
        GetImagingAgain=$(echo ${GetImagingAgain} | sed 's/.img /.img.transfer.ongoing /')
        Transfer=$(grep -P "^rsync.*DeviceCopy_${Part}_" ${LogDir}/SendFiles.log)
        Write=$(grep -P "^ssh -p ${SSHPort} .*bs=${Chunk} seek=${Jump}\"" ${LogDir}/ToDeviceImageFiles.log)
        Write=$(echo ${Write} | sed 's/.transfer.ToDev/.transfer/')
        Check=$(grep -P "^ssh -p ${SSHPort} .*bs=${Chunk} skip=${Jump} " ${LogDir}/ToDeviceImageFiles.log)
        LocalChecksum=$(eval ${GetImagingAgain} 2>/dev/null)
        echo "Checksum step ${Part} local is ${LocalChecksum% -}" | tee -a ${Report}
        eval ${Transfer}
        eval ${Write} 2>/dev/null
        RemoteChecksum=$(eval ${Check} 2>/dev/null)
        if [[ "${LocalChecksum% -}" == "${RemoteChecksum% -}" ]] ; then
            echo "Checksum step ${Part} remote is ${RemoteChecksum% -}" | tee -a ${Report}
            echo "Checksum step ${Part} is ok on both sides." | tee -a ${Report}
        else
            echo "ERROR: Checksum step ${Part} still differs." | tee -a ${Report}
            ErrorSum=$(( ${ErrorSum} + 1 ))
        fi  
    fi  
    export ErrorSum=${ErrorSum}
}

function ShowProceeding {

    LocalProcess=true
    RemoteProcess=true
    sleep 5

    while [[ "${LocalProcess}" == "true" || "${RemoteProcess}" == "true" ]] ; do

    LocalFiles=$(ls -lh ${LogDir}/*_${JobTimeFile}.img* ${LogDir}/*.run 2> /dev/null)
        if [[ "${LocalFiles}" == "" ]] ; then
            LocalProcess=false
            (echo "Last file sent to remote device") | tee -a ${Report}
        else
            echo "Local files in progress:"
            echo
            echo ${LocalFiles} | tr ' ' '\n' | grep transfer | head -n 15
            echo
        fi

    RemoteFiles=$($SSH_command "ls -a ${RemoteLogDir}/*_${JobTimeFile}.* ${RemoteLogDir}/.*_${JobTimeFile}.* 2> /dev/null") 
        if [[ "$(echo ${RemoteFiles} | tr ' ' '\n' | grep -c .done)" == "${LastRun}" ]] ; then
            RemoteProcess=false
        else
            echo "Remote files in progress:"
            echo
            echo ${RemoteFiles} | tr ' ' '\n' | grep img | head -n 15
            echo
        fi
    sleep 4

    done

    echo "Comparing data checksum results..."

    for i in $(seq -w ${LastRun}) ; do
        Local=$(grep -P "Checksum step ${i} local is " ${Report} | tail -n 1 | cut -d ' ' -f6)
        Remote=$(grep -P "Checksum step ${i} remote is " ${Report} | tail -n 1 | cut -d ' ' -f6)
        if [[ "${Local}" != "${Remote}" ]] || [[ "${Remote}" == "" ]] ; then
            echo "WARNING: Checksum for local and remote step ${i} is not identical!" | tee -a ${Report}
            echo "Attempt to fix will be initiated."
        fi
    done

    ErrorSum=0
    for CheckAgain in $(grep "is not identical" ${Report} | cut -d ' ' -f8 | sort | uniq) ; do
        ReImage ${CheckAgain};
    done

    echo "Job finished, ${ErrorSum} errors reported!" | tee -a ${Report}

    if [[ "${KeepLogs}" == "no" ]] && [[ ${ErrorSum} -eq 0 ]] ; then
        rm -rf ${LogDir}
        $SSH_command "rm -rf ${RemoteLogDir}"
    else
        echo "Check local ${LogDir} and remote ${RemoteLogDir} directory for details."
    fi
}

echo Start from block ${StartBlock} to block ${Blocks} with ${SectorPortion} blocks at $JobTimeFile

CreateImageFiles 2>&1 >> ${LogDir}/CreateImageFiles.log &

SendFiles 2>&1 >> ${LogDir}/SendFiles.log &

ImageToTargetDevice 2>&1 >> ${LogDir}/ToDeviceImageFiles.log &

ShowProceeding

 