> Hash files: question about how they are calculated

Girder
Jan 19 2004, 12:50
Group: Members
Posts: 12
Joined: 28-October 03



How is the hash of a file and of its parts calculated?

Could someone please show it as formulas (or as Delphi code)?

Thanks in advance.
Post #1
Fuxie - DK
Jan 19 2004, 13:05
Group: Managers
Posts: 4800
Joined: 21-January 03
From: Copenhagen, Denmark



Can you please explain what you mean?

It makes no sense to me at all...
Post #2
reanimated838uk
Jan 19 2004, 13:10
Nutcracker


Group: Betatesters
Posts: 5198
Joined: 1-March 03
From: UK



He wants to know how the hash ID for a file is calculated, so he can express it as a formula or implement it in code.

I think it uses MD4 hashing, though I'm not sure exactly how. I think the official eMule has it in its FAQ/Help, but I'm not sure.
Post #3
Fuxie - DK
Jan 19 2004, 13:26

Well... that wasn't very clear. :D
Post #4
Girder
Jan 19 2004, 13:27

QUOTE
He wants to know how the hash ID for a file is calculated, so he can express it as a formula or implement it in code.

- You are right.
In short: I am interested in the calculation algorithm.

I have searched everywhere and dug through the source code, but I understand C++ poorly.

Maybe the experts on this forum can help.
Post #5
morevit
Jan 19 2004, 13:43
Group: Retired Devs
Posts: 1066
Joined: 10-September 03
From: US East Coast



Here it is...

CODE

///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// CreateHashFromInput() generates a hash from the next 'Length' bytes of one of 'file', 'file2', or 'in_string'
//  (the other two must be NULL (UGLY UGLY UGLY)). The hash is returned in '*Output'.
void CKnownFile::CreateHashFromInput(FILE* file,CFile* file2, int Length, uchar* Output, uchar* in_string)
{
EMULE_TRY

// time critical
uint32 Hash[4];

Hash[0] = 0x67452301;
Hash[1] = 0xEFCDAB89;
Hash[2] = 0x98BADCFE;
Hash[3] = 0x10325476;

CFile* data = 0;

if (in_string)
 data = new CMemFile(in_string,Length);

uint32 Required = Length;
uchar   X[64*128];

while (Required >= 64)
{
 uint32 len = Required & ~63;

 if (len > sizeof(X))
  len = sizeof(X);
 if (in_string)
  data->Read(&X,len);
 else if (file)
  fread(&X,len,1,file);
 else if (file2)
  file2->Read(&X,len);
 uint32 i = 0;
 do
 {
  MD4Transform(Hash, (uint32*)(X + i));
  i += 64;
 } while(i < len);
 Required -= len;
}
// read the remaining bytes (fewer than 64), if any
if (Required != 0)
{
 if (in_string)
  data->Read(&X,Required);
 else if (file)
  fread(&X,Required,1,file);
 else if (file2)
  file2->Read(&X,Required);
}
// MD4 padding: append a 0x80 byte, then zeros up to byte 56 of the block
// (in bytes, 512 bits = 64 and 448 bits = 56)
X[Required++] = 0x80;
if (Required > 56)
{
 memset2(&X[Required], 0, 64 - Required);
 MD4Transform(Hash, (uint32*)X);
 Required = 0;
}
memset2(&X[Required], 0, 56 - Required);
// append the data length in bits as a 64-bit value (low 32-bit word first)
uint32 Length2[2] = { Length << 3, (uint32)Length >> 29 };
memcpy2(&X[56], Length2, 8);
MD4Transform(Hash, (uint32*)X);
md4cpy(Output, Hash);
safe_delete(data);

EMULE_CATCH
}
/////////////////////////////////////////////////////////////////////////////////////////////

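// Basic MD4 functions, rotation, and per-round shift amounts.
// These are not part of the posted snippet; they are the standard definitions
// from RFC 1320 and are assumed to match what the eMule source defines elsewhere.
#define MD4_F(x, y, z) (((x) & (y)) | ((~(x)) & (z)))
#define MD4_G(x, y, z) (((x) & (y)) | ((x) & (z)) | ((y) & (z)))
#define MD4_H(x, y, z) ((x) ^ (y) ^ (z))
#define MD4_ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

#define S11 3
#define S12 7
#define S13 11
#define S14 19
#define S21 3
#define S22 5
#define S23 9
#define S24 13
#define S31 3
#define S32 9
#define S33 11
#define S34 15

// partial transformations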
#define MD4_FF(a, b, c, d, x, s) \
{ \
 (a) += MD4_F((b), (c), (d)) + (x); \
 (a) = MD4_ROTATE_LEFT((a), (s)); \
}

#define MD4_GG(a, b, c, d, x, s) \
{ \
 (a) += MD4_G((b), (c), (d)) + (x) + (uint32)0x5A827999; \
 (a) = MD4_ROTATE_LEFT((a), (s)); \
}

#define MD4_HH(a, b, c, d, x, s) \
{ \
 (a) += MD4_H((b), (c), (d)) + (x) + (uint32)0x6ED9EBA1; \
 (a) = MD4_ROTATE_LEFT((a), (s)); \
}

/////////////////////////////////////////////////////////////////////////////////////////////
static void MD4Transform(uint32 Hash[4], uint32 x[16])
{
EMULE_TRY

uint32 a = Hash[0];
uint32 b = Hash[1];
uint32 c = Hash[2];
uint32 d = Hash[3];

/* Round 1 */
MD4_FF(a, b, c, d, x[ 0], S11); // 01
MD4_FF(d, a, b, c, x[ 1], S12); // 02
MD4_FF(c, d, a, b, x[ 2], S13); // 03
MD4_FF(b, c, d, a, x[ 3], S14); // 04
MD4_FF(a, b, c, d, x[ 4], S11); // 05
MD4_FF(d, a, b, c, x[ 5], S12); // 06
MD4_FF(c, d, a, b, x[ 6], S13); // 07
MD4_FF(b, c, d, a, x[ 7], S14); // 08
MD4_FF(a, b, c, d, x[ 8], S11); // 09
MD4_FF(d, a, b, c, x[ 9], S12); // 10
MD4_FF(c, d, a, b, x[10], S13); // 11
MD4_FF(b, c, d, a, x[11], S14); // 12
MD4_FF(a, b, c, d, x[12], S11); // 13
MD4_FF(d, a, b, c, x[13], S12); // 14
MD4_FF(c, d, a, b, x[14], S13); // 15
MD4_FF(b, c, d, a, x[15], S14); // 16

/* Round 2 */
MD4_GG(a, b, c, d, x[ 0], S21); // 17
MD4_GG(d, a, b, c, x[ 4], S22); // 18
MD4_GG(c, d, a, b, x[ 8], S23); // 19
MD4_GG(b, c, d, a, x[12], S24); // 20
MD4_GG(a, b, c, d, x[ 1], S21); // 21
MD4_GG(d, a, b, c, x[ 5], S22); // 22
MD4_GG(c, d, a, b, x[ 9], S23); // 23
MD4_GG(b, c, d, a, x[13], S24); // 24
MD4_GG(a, b, c, d, x[ 2], S21); // 25
MD4_GG(d, a, b, c, x[ 6], S22); // 26
MD4_GG(c, d, a, b, x[10], S23); // 27
MD4_GG(b, c, d, a, x[14], S24); // 28
MD4_GG(a, b, c, d, x[ 3], S21); // 29
MD4_GG(d, a, b, c, x[ 7], S22); // 30
MD4_GG(c, d, a, b, x[11], S23); // 31
MD4_GG(b, c, d, a, x[15], S24); // 32

/* Round 3 */
MD4_HH(a, b, c, d, x[ 0], S31); // 33
MD4_HH(d, a, b, c, x[ 8], S32); // 34
MD4_HH(c, d, a, b, x[ 4], S33); // 35
MD4_HH(b, c, d, a, x[12], S34); // 36
MD4_HH(a, b, c, d, x[ 2], S31); // 37
MD4_HH(d, a, b, c, x[10], S32); // 38
MD4_HH(c, d, a, b, x[ 6], S33); // 39
MD4_HH(b, c, d, a, x[14], S34); // 40
MD4_HH(a, b, c, d, x[ 1], S31); // 41
MD4_HH(d, a, b, c, x[ 9], S32); // 42
MD4_HH(c, d, a, b, x[ 5], S33); // 43
MD4_HH(b, c, d, a, x[13], S34); // 44
MD4_HH(a, b, c, d, x[ 3], S31); // 45
MD4_HH(d, a, b, c, x[11], S32); // 46
MD4_HH(c, d, a, b, x[ 7], S33); // 47
MD4_HH(b, c, d, a, x[15], S34); // 48

Hash[0] += a;
Hash[1] += b;
Hash[2] += c;
Hash[3] += d;

EMULE_CATCH
}
//////////////////////////////////////////////////////////////////////////////////////////


I don't pretend to understand completely what's going on, but essentially, the hash is initialized to some magic value and then the data is eaten 64 bytes at a time, put through the MD4 transform, and added to the hash.

The MD4 transform is the core of the hash. It mixes the next 64 bytes of data into the running 16-byte hash value, aiming for a reasonably unique result.

There's a little complication at the end because the data usually isn't an exact multiple of 64 bytes long, so the last block is padded and the data length (in bits) is appended.
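
The end-of-data handling can be sketched on its own. This is only an illustration of the padding rule from RFC 1320, not code from the eMule source: the name `md4_pad` is made up for the example, and it builds the whole padded message in memory instead of reusing a buffer the way `CreateHashFromInput()` does.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from eMule): returns `msg` followed by MD4 padding,
// i.e. one 0x80 byte, zero bytes until the length is 56 modulo 64, and then
// the original length in bits as a 64-bit little-endian integer.
std::vector<std::uint8_t> md4_pad(const std::vector<std::uint8_t>& msg)
{
    std::vector<std::uint8_t> out(msg);
    out.push_back(0x80);
    while (out.size() % 64 != 56)
        out.push_back(0x00);

    const std::uint64_t bit_len = static_cast<std::uint64_t>(msg.size()) * 8;
    for (int i = 0; i < 8; ++i)
        out.push_back(static_cast<std::uint8_t>(bit_len >> (8 * i)));
    return out;  // size is always a multiple of 64
}
```

With 55 bytes of data, the 0x80 byte and the length still fit into the same 64-byte block. With 56 bytes they no longer fit, which is the case the `Required > 56` check in the posted code handles by transforming one extra block.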

Hope that helps.
Post #6
moosetea
Jan 19 2004, 13:54
Group: Retired Devs
Posts: 803
Joined: 7-February 03



It uses MD4 hashing...

Each part (about 9.28 MB, i.e. 9,728,000 bytes) is hashed with MD4 to produce the part hash, then all the part hashes are hashed together to produce the overall file hash. The file hash is contained in the ed2k link; the hashes of all the parts (known as a hashset) are transferred peer-to-peer between two clients when you find a source for the file.

This way we can validate each part, since all clients have the hashset. When we finish downloading a part, it is hashed; if it matches the part hash in the hashset, it is correct and will be uploaded to other clients. If it doesn't match, the part is dropped or a fix is attempted.
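
To make the part layout concrete, here is a small sketch of how a file size maps to parts. It assumes the usual eDonkey part size of 9,728,000 bytes; the names `kPartSize`, `part_count` and `part_length` are invented for this example and are not taken from the eMule source.

```cpp
#include <cstdint>

// Assumed part size: 9,728,000 bytes (9500 KiB).
const std::uint64_t kPartSize = 9728000;

// Number of parts a file of `file_size` bytes is cut into for hashing.
// Every part is kPartSize bytes long except possibly the last one.
// How a given client treats empty files, or sizes that are an exact
// multiple of kPartSize, is not modelled here.
std::uint64_t part_count(std::uint64_t file_size)
{
    return (file_size + kPartSize - 1) / kPartSize;
}

// Length in bytes of part `index` (0-based); `index` must be < part_count().
std::uint64_t part_length(std::uint64_t file_size, std::uint64_t index)
{
    const std::uint64_t start = index * kPartSize;
    const std::uint64_t left = file_size - start;
    return left < kPartSize ? left : kPartSize;
}
```

For a 20,000,000-byte file this gives three parts: two full ones and a last part of 544,000 bytes. Each of those byte ranges gets its own MD4 part hash.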
Post #7
KuSh
Jan 19 2004, 13:55
Group: Developers
Posts: 1182
Joined: 15-December 03



It is just an MD4 hash function over the whole file...

The magic value and the additions you're talking about are part of the MD4 algorithm.

Try searching the web for hashing functions (Google rocks)... My English isn't good enough to explain precisely how it works and what it is used for.

KuSh
Post #8
morevit
Jan 19 2004, 13:56

He was asking to see the actual hashing algorithm, moose.
Post #9
morevit
Jan 19 2004, 13:57

QUOTE (KuSh @ Jan 19 2004, 08:55)
It is just an MD4 hash function over the whole file...

The magic value and the additions you're talking about are part of the MD4 algorithm.

Try searching the web for hashing functions (Google rocks)... My English isn't good enough to explain precisely how it works and what it is used for.

KuSh

Saying "MD4 MD4" doesn't explain anything. :P I was just trying to give a high-level view of how the algorithm works.
Post #10
moosetea
Jan 19 2004, 13:58

Well, it's not that complicated as long as you know the theory. You can put together a file hasher in only a few lines of VB.NET code (if you have a C# MD4 library/class).


The MD4 RFC (RFC 1320), if you want to know how MD4 works:
http://community.roxen.com/developers/idocs/rfc/rfc1320.html

This post has been edited by moosetea: Jan 19 2004, 14:02
Post #11
reanimated838uk
Jan 19 2004, 14:10

Is that MD4 optimisation code by Aw3 related to this hashing? I've noticed the hashing in RC2 is a lot smoother and less demanding than in previous versions, and was wondering if that was the reason.
Post #12
KuSh
Jan 19 2004, 14:13

QUOTE (morevit @ Jan 19 2004, 13:57)
Saying "MD4 MD4" doesn't explain anything. :P I was just trying to give a high-level view of how the algorithm works.

I know that saying "MD4 MD4" (and I'm not sure that's exactly what I said) doesn't explain everything, but posting a snippet of code without explanation doesn't help much more either (hashing algorithms and their code are really difficult to understand without knowing what a hashing function does and what it is used for)...

I just told him to search the web with keywords because I'm not able (due to my difficulties with a foreign language) to explain properly what I want to explain...

Sorry for the unneeded posts... I will try to do my best.

QUOTE (reanimated838uk @ Jan 19 2004, 14:10)
Is that MD4 optimisation code by Aw3 related to this hashing? I've noticed the hashing in RC2 is a lot smoother and less demanding than in previous versions, and was wondering if that was the reason.

Yup, they're related!

KuSh

This post has been edited by KuSh: Jan 19 2004, 14:16
Post #13
reanimated838uk
Jan 19 2004, 14:19

Morevit isn't saying you said "MD4 MD4" exactly; he just means bringing up its name a lot without actually going through how its code works.
I don't understand it either, but mainly because my coding skills really suck :)
Post #14
moosetea
Jan 19 2004, 14:31

Hmm, but "MD4, MD4" is about right ;P. An eDonkey hash is the MD4 of all the part hashes. NB: the hashes below are theoretical and not valid, but essentially:

Calculate all the part MD4s (the hashset):
Md4(Part1) == 1FFFFFFFFFFFFFFF
Md4(Part2) == 2FFFFFFFFFFFFFFF
Md4(Part3) == 3FFFFFFFFFFFFFFF

Then hash the concatenated part hashes:
Md4(1FFFFFFFFFFFFFFF2FFFFFFFFFFFFFFF3FFFFFFFFFFFFFFF) = File hash (i.e. 123FFFFFFFFFFFFF)
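
The same scheme written out as a self-contained sketch. This is not the eMule code: `md4`, `ed2k_hash` and `to_hex` are names made up for the example, the MD4 part follows RFC 1320, and `part_size` is a parameter only so the scheme can be tried on tiny inputs (real clients are assumed to use 9,728,000 bytes). Two assumptions to be aware of: a file that fits in a single part gets the plain MD4 of its data as its file hash, and sizes that are an exact multiple of the part size get no special treatment here, which may not match what a particular client does.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes  = std::vector<std::uint8_t>;
using Digest = std::array<std::uint8_t, 16>;

static std::uint32_t rotl(std::uint32_t v, int s) { return (v << s) | (v >> (32 - s)); }

// One MD4 transform over a 64-byte block (RFC 1320), written as a loop.
static void md4_block(std::uint32_t h[4], const std::uint8_t* p)
{
    std::uint32_t x[16];
    for (int i = 0; i < 16; ++i)  // decode sixteen little-endian 32-bit words
        x[i] = std::uint32_t(p[4*i]) | (std::uint32_t(p[4*i+1]) << 8) |
               (std::uint32_t(p[4*i+2]) << 16) | (std::uint32_t(p[4*i+3]) << 24);

    static const int idx2[16] = {0,4,8,12,1,5,9,13,2,6,10,14,3,7,11,15};
    static const int idx3[16] = {0,8,4,12,2,10,6,14,1,9,5,13,3,11,7,15};
    static const int s1[4] = {3,7,11,19}, s2[4] = {3,5,9,13}, s3[4] = {3,9,11,15};

    std::uint32_t a = h[0], b = h[1], c = h[2], d = h[3];
    for (int i = 0; i < 48; ++i) {
        std::uint32_t f; int k, s;
        if (i < 16)      { f = (b & c) | (~b & d);                        k = i;          s = s1[i % 4]; }
        else if (i < 32) { f = ((b & c) | (b & d) | (c & d)) + 0x5A827999u; k = idx2[i-16]; s = s2[i % 4]; }
        else             { f = (b ^ c ^ d) + 0x6ED9EBA1u;                 k = idx3[i-32]; s = s3[i % 4]; }
        const std::uint32_t t = rotl(a + f + x[k], s);
        a = d; d = c; c = b; b = t;  // rotate the roles of a, b, c, d
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d;
}

// MD4 of a whole buffer: pad, append the bit length, transform block by block.
Digest md4(const Bytes& msg)
{
    std::uint32_t h[4] = {0x67452301u, 0xEFCDAB89u, 0x98BADCFEu, 0x10325476u};
    Bytes m(msg);
    m.push_back(0x80);
    while (m.size() % 64 != 56) m.push_back(0);
    const std::uint64_t bits = std::uint64_t(msg.size()) * 8;
    for (int i = 0; i < 8; ++i) m.push_back(std::uint8_t(bits >> (8 * i)));
    for (std::size_t off = 0; off < m.size(); off += 64) md4_block(h, &m[off]);

    Digest out;
    for (int i = 0; i < 16; ++i) out[i] = std::uint8_t(h[i / 4] >> (8 * (i % 4)));
    return out;
}

// File hash = MD4 over the concatenated part hashes (the hashset).
Digest ed2k_hash(const Bytes& data, std::size_t part_size = 9728000)
{
    if (data.size() <= part_size)  // assumption: single part, file hash = part hash
        return md4(data);
    Bytes hashset;
    for (std::size_t off = 0; off < data.size(); off += part_size) {
        const std::size_t n = std::min(part_size, data.size() - off);
        const Digest part = md4(Bytes(data.begin() + off, data.begin() + off + n));
        hashset.insert(hashset.end(), part.begin(), part.end());
    }
    return md4(hashset);
}

// Lower-case hex string of a digest, as it appears in an ed2k link.
std::string to_hex(const Digest& d)
{
    static const char* digits = "0123456789abcdef";
    std::string s;
    for (std::uint8_t b : d) { s += digits[b >> 4]; s += digits[b & 15]; }
    return s;
}
```

A quick sanity check for the `md4` part is to compare it against the test vectors listed in RFC 1320, for instance MD4("abc") = a448017aaf21d8525fc10ae87aa6729d.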

Yeah, the recent change is to do with hashing speed. I believe the change only helps on smaller data, and as we only ever hash 9 MB or so at a time, we get a big performance boost from changing some of the maths.

This post has been edited by moosetea: Jan 19 2004, 14:36
Post #15
