Abstract: Secure data deduplication can significantly reduce the communication and storage overheads in cloud storage services, and has potential applications in our big data driven society. Existing data deduplication schemes are generally designed either to resist brute-force attacks or to ensure efficiency and data availability, but not both. To our knowledge, no existing scheme achieves accountability, in the sense of reducing duplicate information disclosure (e.g., determining whether the plaintexts of two encrypted messages are identical). This seminar work investigates a three-tier cross-domain architecture and presents an efficient and privacy-preserving big data deduplication scheme for cloud storage (hereafter referred to as EPCDD). EPCDD achieves both privacy preservation and data availability, and resists brute-force attacks. In addition, it takes accountability into consideration to offer better privacy assurances than existing schemes. The work then demonstrates that EPCDD outperforms existing competing schemes in terms of computation, communication and storage overheads. Finally, the time complexity of duplicate search in EPCDD is logarithmic in the number of stored items.
Keywords: Secure Data Deduplication, Encrypted Data, Data Availability, Accountability, Cloud Computing.
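
To illustrate the claim that duplicate search can run in logarithmic time, the following is a minimal sketch, not EPCDD's actual construction. It assumes each stored item is represented by a fixed-length, comparable tag kept in sorted order, so that a lookup is a binary search. The names `make_tag` and `TagIndex` are hypothetical, and the plain SHA-256 tag is only a stand-in: on its own it does not resist brute-force attacks the way the scheme's tags are meant to.

```python
import bisect
import hashlib


def make_tag(data: bytes) -> bytes:
    # Stand-in tag (assumption): a plain SHA-256 digest of the data.
    # This is NOT how EPCDD derives its tags; it only gives comparable values.
    return hashlib.sha256(data).digest()


class TagIndex:
    """Sorted list of tags; duplicate lookup is a binary search, O(log n)."""

    def __init__(self):
        self._tags = []

    def is_duplicate(self, tag: bytes) -> bool:
        # bisect_left finds the leftmost position where tag could be inserted.
        i = bisect.bisect_left(self._tags, tag)
        return i < len(self._tags) and self._tags[i] == tag

    def add(self, tag: bytes) -> bool:
        """Store the tag if it is new; return True if it was already present."""
        i = bisect.bisect_left(self._tags, tag)
        if i < len(self._tags) and self._tags[i] == tag:
            return True
        # Inserting into a Python list shifts elements, so this step is O(n).
        self._tags.insert(i, tag)
        return False


index = TagIndex()
first_upload = index.add(make_tag(b"quarterly report"))
second_upload = index.add(make_tag(b"quarterly report"))
```

Here `first_upload` is `False` (the item is new) and `second_upload` is `True` (a duplicate was found). Only the search is logarithmic in this sketch; a balanced search tree would be needed to make insertion logarithmic as well.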