Author Topic: [MOD] Check for duplicate images v1.0  (Read 58205 times)

Offline V@no

  • If you don't tell me what to do, I won't tell you where you should go :)
  • Global Moderator
  • 4images Guru
  • *****
  • Posts: 17.849
  • mmm PHP...
    • View Profile
    • 4images MODs Demo
[MOD] Check for duplicate images v1.0
« on: September 10, 2006, 02:50:33 AM »
This is a very simple check of whether an uploaded file has been uploaded before.
The mod generates an MD5 hash for each image in the database and compares these hashes with the MD5 hash of each newly uploaded file.
The name, filename and extension are ignored by this mod; only the contents of the file are used to generate the MD5 hash. This method has a weak spot, though: if a file was altered in any way, its MD5 hash will differ from that of the original file.
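
To illustrate the point, here is a minimal sketch that is not part of the mod. It hashes temporary files that stand in for uploads: two files with identical contents but different names produce the same hash, while a one-byte change produces a different one. The file prefixes and contents are made up for the demonstration.

```php
<?php
// Illustration only: temporary files stand in for uploaded images.
$dir = sys_get_temp_dir();
$a = tempnam($dir, 'imgA');
$b = tempnam($dir, 'imgB');
$c = tempnam($dir, 'imgC');

file_put_contents($a, "fake image bytes");
file_put_contents($b, "fake image bytes");   // same contents, different file name
file_put_contents($c, "fake image bytes!");  // one extra byte, e.g. after a re-save or watermark

// md5_file() hashes the file contents only; the name plays no role.
$same_content_matches = (md5_file($a) === md5_file($b));
$altered_file_matches = (md5_file($a) === md5_file($c));

var_dump($same_content_matches); // bool(true)
var_dump($altered_file_matches); // bool(false)

unlink($a);
unlink($b);
unlink($c);
```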


----------[ Installation ]------------

Step 1
Download this package. (The file is also attached at the bottom of this post.)
Extract it and upload the image_md5_hash.php file into the admin/plugins/ folder (if no such folder exists, create it).


Step 2
Open member.php
Find:
Code: [Select]
  if (!$error) {
    // Start Upload
    include(ROOT_PATH.'includes/upload.php');
Insert above:
Code: [Select]
/*
  MOD CHECK FOR DUPLICATE IMAGES
  START INSERT
*/
##########
# CONFIG #
##########

  $check_admin = true; //do check when administrator is uploading? (true/false)
  $show_image = true; //show link to the image that was previously uploaded? (true/false)
  $show_member = true; //show name and link to profile page of the member who previously uploaded that file? (true/false)

##############
# END CONFIG #
##############

  $md5 = "";
  unset($HTTP_POST_VARS['image_md5']);
  if ($user_info['user_level'] != ADMIN || $check_admin)
  {
    if (!empty($HTTP_POST_FILES['media_file']['tmp_name']) && $HTTP_POST_FILES['media_file']['tmp_name'] != "none")
    {
      $md5 = md5_file($HTTP_POST_FILES['media_file']['tmp_name']);
      $file = $HTTP_POST_FILES['media_file']['name']; // original filename; PHP's upload array key is 'name'
    }
    elseif ($remote_media_file)
    {
      $md5 = md5($remote_media_file);
      $file = $remote_media_file;
    }
    if ($md5)
    {
      $sql = "SELECT image_id, image_name, cat_id, user_id
              FROM ".IMAGES_TABLE."
              WHERE image_md5 = '".$md5."'
              LIMIT 1";
      if ($row = $site_db->query_firstrow($sql))
      {

        $row['image_name'] = stripslashes($row['image_name']);
        if (function_exists('multilang')) $row['image_name'] = multilang($row['image_name']);
        $user_row = get_user_info($row['user_id']);
//        $msg .= (($msg != "") ? "<br />" : "")."<b>".$lang['file_upload_error'].": ".$file."</b><br />";
        $msg .= (($msg != "") ? "<br />" : "").(($user_info['user_level'] > GUEST && $user_info['user_id'] == $user_row['user_id']) ? $lang['image_md5_duplicate_self'] : sprintf(($show_member ? $lang['image_md5_duplicate_more'] : $lang['image_md5_duplicate_simple']), "<a href=\"".$site_sess->url(ROOT_PATH."member.php?action=showprofile&".URL_USER_ID."=".$user_row['user_id'])."\">".$user_row['user_name']."</a>"));
        if ($show_image && (($user_info['user_level'] > GUEST && $user_info['user_id'] != $user_row['user_id']) || (check_permission("auth_viewcat", $row['cat_id']) && check_permission("auth_viewimage", $row['cat_id']))))
        {
          $msg .= ": <a href=\"".$site_sess->url(ROOT_PATH."details.php?image_id=".$row['image_id'])."\">".$row['image_name']."</a>";
        }
        $error = 1;
      }
      else
      {
        $sql = "SELECT image_id, image_name, user_id
                FROM ".IMAGES_TEMP_TABLE."
                WHERE image_md5 = '".$md5."'
                LIMIT 1";
        if ($row = $site_db->query_firstrow($sql))
        {
          $user_row = get_user_info($row['user_id']);
//          $msg .= (($msg != "") ? "<br />" : "")."<b>".$lang['file_upload_error'].": ".$file."</b><br />";
          $msg .= (($msg != "") ? "<br />" : "").(($user_info['user_level'] > GUEST && $user_info['user_id'] == $row['user_id']) ? $lang['image_md5_duplicate_validation_self'] : sprintf(($show_member ? $lang['image_md5_duplicate_validation_more'] : $lang['image_md5_duplicate_validation_simple']), "<a href=\"".$site_sess->url(ROOT_PATH."member.php?action=showprofile&".URL_USER_ID."=".$user_row['user_id'])."\">".$user_row['user_name']."</a>"));
          $error = 1;
        }
      }
      $HTTP_POST_VARS['image_md5'] = $md5;
    }
  }
/*
  MOD CHECK FOR DUPLICATE IMAGES
  END INSERT
*/


Step 3
Open lang/<your language>/main.php
At the end, above the closing ?>, insert:

English:
Code: [Select]
$lang['image_md5'] = "Image MD5 hash";
$lang['image_md5_duplicate_self'] = "You have submitted this file before";
$lang['image_md5_duplicate_more'] = "This file has been previously submitted by %s";
$lang['image_md5_duplicate_simple'] = "This file has been previously submitted";
$lang['image_md5_duplicate_validation_self'] = "You have submitted this file before and it is awaiting validation.";
$lang['image_md5_duplicate_validation_more'] = "This file has been previously submitted by %s and is awaiting validation.";
$lang['image_md5_duplicate_validation_simple'] = "This file has been previously submitted and is awaiting validation.";

Russian:
Code: [Select]
$lang['image_md5'] = "MD5 хэш файла";
$lang['image_md5_duplicate_self'] = "Вы уже раньше заливали этот файл";
$lang['image_md5_duplicate_more'] = "%s уже залили этот файл до вас";
$lang['image_md5_duplicate_simple'] = "Этот файл кто-то уже залил до вас";
$lang['image_md5_duplicate_validation_self'] = "Вы уже залили этот файл, и он ожидает подтверждения от администрации";
$lang['image_md5_duplicate_validation_more'] = "%s уже залили этот файл, и файл ожидает подтверждения от администрации";
$lang['image_md5_duplicate_validation_simple'] = "Кто-то уже залил этот файл, и файл ожидает подтверждения от администрации";


Step 4
Open includes/db_field_definitions.php
At the end, above the closing ?>, insert:
Code: [Select]
$additional_image_fields['image_md5'] = array($lang['image_md5'], "text", 0);




After installation, you should see a new link, "Image MD5 hash update", in the "Plugins" section of the ACP (Admin Control Panel) menu.
When you click on it for the first time, the database installation process will start.
After the new fields have been added to the database, click the "Image MD5 hash update" link again to generate MD5 hashes for all images in your gallery.

NOTE:
For remote images, the MD5 hash is generated from the URL and not from the image content, so if the image changes on the remote server, the MD5 hash will not change.
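
In other words, for remote files the mod hashes the address string itself, as the md5($remote_media_file) call in Step 2 shows. A small sketch of the consequence, using a made-up URL and made-up contents (nothing is downloaded):

```php
<?php
// Hypothetical remote image URL; used only as a string here.
$url = 'http://example.com/pics/cat.jpg';

// Pretend the file behind the URL changed between two submissions.
$content_before = 'old image bytes';
$content_after  = 'new image bytes';

// What gets stored for a remote image: a hash of the URL string.
$url_hash_first  = md5($url);
$url_hash_second = md5($url);

// A content hash would notice the change; the URL hash cannot.
$content_changed  = (md5($content_before) !== md5($content_after));
$url_hash_changed = ($url_hash_first !== $url_hash_second);

var_dump($content_changed);  // bool(true)
var_dump($url_hash_changed); // bool(false)
```

The reverse also holds: the same picture submitted under two different URLs yields two different hashes, so it will not be reported as a duplicate.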

P.S.
In case the download link doesn't work, I've attached the package, but I cannot guarantee that the attached package will be kept up to date in the future.

P.P.S.
This mod was created for 4images v1.7.3 but should work on v1.7 as well :)
« Last Edit: April 24, 2008, 11:35:01 PM by Nicky »
Your first three "must do" before you ask a question:
Please do not PM me asking for help unless you've been specifically asked to do so. Such PMs will be deleted without answer. (forum rule #6)
Extension for Firefox/Thunderbird: Master Password+    Back/Forward History Tweaks (restartless)    Cookies Manager+    Fit Images (restartless for Thunderbird)

Offline impss

  • Sr. Member
  • ****
  • Posts: 382
    • View Profile
    • Cusstom.net
Re: [MOD] Check for duplicate images v1.0
« Reply #1 on: September 10, 2006, 10:11:50 AM »
Thanks, V@no.


I have been needing something like this for a while.

And it seems to be working well.

 :D

Offline JensF

  • Addicted member
  • ******
  • Posts: 1.028
    • View Profile
    • http://www.terraristik-galerie.de
Re: [MOD] Check for duplicate images v1.0
« Reply #2 on: September 10, 2006, 07:17:22 PM »
Hi,

Does this mod work when the upload mode in the ACP is set to "save image with new name"?


Kind regards,
Jens Funk



-> Sorry for my bad English <-

Offline V@no

Re: [MOD] Check for duplicate images v1.0
« Reply #3 on: September 10, 2006, 07:40:28 PM »
Yes, the filename is irrelevant for this mod.

Offline JensF

Re: [MOD] Check for duplicate images v1.0
« Reply #4 on: September 10, 2006, 08:01:17 PM »
OK, I will test it, but I am a little wary of this mod.

You say:

Quote
If your images/files were resized, watermarked or altered in any way after they were uploaded, the MD5 hash will not be accurate, therefore will allow upload again the original files


My images are watermarked... Can anyone explain the problem to me in German?

Offline antonio2005

  • Newbie
  • *
  • Posts: 11
    • View Profile
Re: [MOD] Check for duplicate images v1.0
« Reply #5 on: September 11, 2006, 01:07:55 AM »
Great mod!!!!

Thanks a lot.

Simple yet effective.

A fantastic piece of code.

Offline antonio2005

Re: [MOD] Check for duplicate images v1.0
« Reply #6 on: September 19, 2006, 12:23:02 PM »
Hi V@no,

I've been using this mod since you posted it, and as I said, it works great.

I'd like to know whether it would be difficult for you to add another feature. Let me explain
(if you have time... :lol:):

When I installed the mod, it generated the MD5 records for all files on the server. Perfect!
When I try to add an existing file, the mod doesn't allow it. Perfect!

What about the files I already had when I first ran the script? Is there any way to see whether there are duplicate entries among the old files?
Would it be difficult to add a menu item like "Files Check" that looks at the whole database and shows only the files with duplicate MD5 records, with the option to delete one of them?

Thanks in advance,
António




Offline V@no

Re: [MOD] Check for duplicate images v1.0
« Reply #7 on: September 19, 2006, 02:20:25 PM »
Yes, I had thought about it, but since the MD5 hash is not indexed in the database, it creates too much load on the database (at least with the queries I've been testing)...
Anyway, until I get my site back up, this is not going to happen...
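
For anyone who wants to experiment in the meantime, here is a rough sketch of the idea; it is not part of the mod and has not been tested against a real gallery. The grouping function works on rows already fetched from the database (only image_id and image_md5 are needed). The two SQL strings show an index and a grouped query that might reduce the load. The table name 4images_images assumes the default table prefix, and the index name, the prefix length of 32 (the length of an MD5 hex string) and the function name are arbitrary choices for this example. Try it on a copy of the database first.

```php
<?php
// Assumption: table name with the default 4images prefix; adjust to your installation.
// Optional index so lookups and grouping by hash do not scan every row.
$sql_add_index = "ALTER TABLE 4images_images ADD INDEX idx_image_md5 (image_md5(32))";

// Lets the database do the grouping and return only hashes used more than once.
$sql_find_duplicates = "SELECT image_md5, COUNT(*) AS copies
                        FROM 4images_images
                        WHERE image_md5 <> ''
                        GROUP BY image_md5
                        HAVING COUNT(*) > 1";

// Hypothetical helper: the same grouping done in PHP on already fetched rows.
// Returns array(hash => array(image_id, ...)) for hashes shared by 2+ images.
function find_duplicate_hashes(array $rows)
{
    $by_hash = array();
    foreach ($rows as $row) {
        if ($row['image_md5'] == '') {
            continue; // image has not been hashed yet
        }
        $by_hash[$row['image_md5']][] = $row['image_id'];
    }
    return array_filter($by_hash, function ($ids) {
        return count($ids) > 1;
    });
}

// Made-up rows: images 1 and 3 share the same hash.
$rows = array(
    array('image_id' => 1, 'image_md5' => md5('cat')),
    array('image_id' => 2, 'image_md5' => md5('dog')),
    array('image_id' => 3, 'image_md5' => md5('cat')),
    array('image_id' => 4, 'image_md5' => ''),
);

$dupes = find_duplicate_hashes($rows);
echo count($dupes) . "\n";                    // 1
echo implode(', ', $dupes[md5('cat')]) . "\n"; // 1, 3
```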

Offline babe

  • Jr. Member
  • **
  • Posts: 63
    • View Profile
Re: [MOD] Check for duplicate images v1.0
« Reply #8 on: October 05, 2006, 11:17:56 AM »
Has anyone tested the MD5 hash generation on a database with over 25k images? I wonder how long it would take and whether it would complete successfully with such a large collection.

Offline impss

Re: [MOD] Check for duplicate images v1.0
« Reply #9 on: October 05, 2006, 03:14:23 PM »
Quote
Has anyone tested the MD5 hash generation on a database with over 25k images? I wonder how long it would take and whether it would complete successfully with such a large collection.

You tell it how many images to process at once, and it works through them in steps, continuing by itself.
I did 15k pictures at 100 pics at a time. I didn't have any problems.
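
A rough sketch of that stepping idea, which is not the plugin's actual code: the work list is cut into batches of a fixed size and each batch is hashed in its own step, so no single request has to handle the whole gallery. The function name hash_batch() and the in-memory file list are stand-ins for this example; the real plugin presumably reads its list from the database and reloads the page between steps.

```php
<?php
// Hypothetical helper: hash one batch of files starting at $offset.
// Returns array(hashes for this step, offset of the next step or null when done).
function hash_batch(array $files, $offset, $per_step)
{
    $hashes = array();
    foreach (array_slice($files, $offset, $per_step) as $path) {
        $hashes[$path] = md5_file($path);
    }
    $next = $offset + $per_step;
    return array($hashes, ($next < count($files)) ? $next : null);
}

// Demo: five temporary files, two files per step.
$files = array();
for ($i = 0; $i < 5; $i++) {
    $path = tempnam(sys_get_temp_dir(), 'md5');
    file_put_contents($path, "image $i");
    $files[] = $path;
}

$all = array();
$steps = 0;
$offset = 0;
while ($offset !== null) {
    list($hashes, $offset) = hash_batch($files, $offset, 2);
    $all += $hashes; // keys are file paths, so nothing is overwritten
    $steps++;
}

echo count($all) . " files hashed in " . $steps . " steps\n"; // 5 files hashed in 3 steps

array_map('unlink', $files);
```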

Offline babe

Re: [MOD] Check for duplicate images v1.0
« Reply #10 on: October 17, 2006, 06:27:40 PM »
Just perfect.

Hey, thanks for telling me that!

Offline theolbap

  • Full Member
  • ***
  • Posts: 118
  • Search Google "AH"
    • View Profile
Re: [MOD] Check for duplicate images v1.0
« Reply #11 on: November 30, 2006, 10:31:53 PM »
Help!!

In the upload form, the photo is uploaded, but these warnings appear:

Warning: md5_file() [function.md5-file]: open_basedir restriction in effect. File(/tmp/phpzCUGaD) is not within the allowed path(s): (/www/docs/ahmira.com.ar/) in /www/docs/ahmira.com.ar/public_html/member.php on line 586

Warning: md5_file(/tmp/phpzCUGaD) [function.md5-file]: failed to open stream: Operation not permitted in /www/docs/ahmira.com.ar/public_html/member.php on line 586


I have v1.7.3, thanks.
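
The warnings say that PHP's open_basedir setting forbids the script from reading the temporary upload in /tmp, so md5_file() cannot hash it. One possible stopgap, offered as a sketch rather than an official fix, is to suppress the warning and skip the duplicate check when the file cannot be read. safe_md5_file() is a hypothetical helper name, not part of the mod or of 4images. The cleaner fix is usually on the server side, for example asking the host to point PHP's upload_tmp_dir setting at a directory inside the allowed path.

```php
<?php
// Hypothetical helper: returns the MD5 hash of a file, or "" when the file
// cannot be read (missing file, open_basedir restriction, and so on).
function safe_md5_file($path)
{
    $hash = @md5_file($path); // @ hides the warning; md5_file() returns false on failure
    return ($hash === false) ? "" : $hash;
}

// Readable file: behaves like md5_file().
$tmp = tempnam(sys_get_temp_dir(), 'up');
file_put_contents($tmp, 'demo');
$ok = safe_md5_file($tmp);

// Unreadable path: empty string, so a check like "if ($md5)" is simply skipped.
$missing = safe_md5_file($tmp . '.does-not-exist');

var_dump($ok === md5('demo')); // bool(true)
var_dump($missing);            // string(0) ""

unlink($tmp);
```

In Step 2 this would stand in for the md5_file() call on the upload's tmp_name. Keep in mind that on such a server the upload then goes through without a duplicate check.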



Offline theolbap

Re: [MOD] Check for duplicate images v1.0
« Reply #12 on: December 18, 2006, 07:45:41 PM »
Help?

Offline WeZ

  • Jr. Member
  • **
  • Posts: 72
    • View Profile
Re: [MOD] Check for duplicate images v1.0
« Reply #13 on: January 11, 2007, 01:04:24 PM »
Hi V@no,

I trust you are well.

My site relies mostly on video within 4images. My file sizes range from 1 MB to over 200 or even 300 MB.

Would it affect performance if this mod were applied to my site? When it generates a checksum, does it need to read the entire file before it can generate it?

Kind Regards,
WeZ
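
On the second question: an MD5 hash of the contents depends on every byte, so the whole file has to be read once. It does not have to sit in memory all at once, though. The sketch below is not the mod's code (the mod calls md5_file()); it only illustrates the point with PHP's hash extension, reading the file in 1 MB chunks. The function name and chunk size are arbitrary choices for this example.

```php
<?php
// Hypothetical helper: MD5 of a file, read chunk by chunk.
// Returns the hex digest, or false if the file cannot be read.
function md5_file_chunked($path, $chunk_size = 1048576)
{
    $fp = @fopen($path, 'rb');
    if ($fp === false) {
        return false;
    }
    $ctx = hash_init('md5');
    while (!feof($fp)) {
        $chunk = fread($fp, $chunk_size);
        if ($chunk === false) {
            fclose($fp);
            return false; // read error: do not return a partial hash
        }
        hash_update($ctx, $chunk); // every byte of the file passes through here
    }
    fclose($fp);
    return hash_final($ctx);
}

// Demo: a 3 MB temporary file stands in for a video.
$tmp = tempnam(sys_get_temp_dir(), 'vid');
file_put_contents($tmp, str_repeat('0123456789abcdef', 196608)); // 16 bytes x 196608 = 3 MB

$chunked = md5_file_chunked($tmp);
$matches_builtin = ($chunked === md5_file($tmp));
var_dump($matches_builtin); // bool(true)

unlink($tmp);
```

The time needed therefore grows with file size, and the duplicate check in Step 2 hashes each file once at upload time.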

Offline WeZ

Re: [MOD] Check for duplicate images v1.0
« Reply #14 on: January 12, 2007, 09:44:02 AM »
Hi V@no and the rest...

How do I get this mod to integrate with "[Plugin] Batch Import" and "[MOD] multiupload"?

Should I post this request in this topic or in the other two?

Ciao
WeZ