Abstract: In an era of widespread data duplication, document checksums are commonly used to identify redundant information quickly and accurately. The calculated value of numbers and letters used to ...
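
As an illustration of the general idea of checksum-based duplicate detection, the following is a minimal sketch and not the method proposed in this paper. It assumes SHA-256 as the checksum function and documents held in memory as byte strings; the names `checksum` and `find_duplicates` are hypothetical and chosen only for this example.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def checksum(data: bytes) -> str:
    """Return a hexadecimal digest (digits and letters a-f) of the data.

    SHA-256 is an assumption made for this sketch; the source does not
    name a specific checksum function here.
    """
    return hashlib.sha256(data).hexdigest()


def find_duplicates(documents: Dict[str, bytes]) -> List[List[str]]:
    """Group document names whose contents produce the same checksum."""
    groups = defaultdict(list)
    for name, content in documents.items():
        groups[checksum(content)].append(name)
    # Keep only checksums shared by more than one document.
    return [sorted(names) for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    docs = {
        "a.txt": b"hello world",
        "b.txt": b"hello world",
        "c.txt": b"something else",
    }
    print(find_duplicates(docs))
```

Comparing fixed-length digests instead of full contents keeps each comparison cheap, at the cost of detecting only byte-identical documents.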