Just a sample of the Echomail archive
Cooperative anarchy at its finest, still active today. Darkrealms is the Zone 1 Hub.
Area: FMAIL_HELP (Fmail support), 2,396 messages
Message 220 of 2,396
From: mark lewis
To: Wilfred van Velzen
Subject: FMail duplicate detection
Date: 23 Apr 14 21:18:57

On Wed, 23 Apr 2014, Wilfred van Velzen wrote to mark lewis:

 ml> what does FMail use for its duplicate detection, please?

 ml> on this system, i see numerous false positive duplicates that have
 ml> passed through the fastecho system of my uplink... i'm trying to
 ml> track down why FMail would think they are duplicates when FE has
 ml> not... the FE system is using its largest dupe database option...

 WvV> First, I have never noticed a false positive. There were false
 WvV> negatives, when the messages were too old to be in the dupe base.

what determines the age of "too old"?? simply the number of entries in the
database, the age of the post by creation date or the age of the post by
arrival time??

 WvV> Regarding dupe detection by fmail, there are "clues" in the doc
 WvV> file:

yes but i was asking so that "we" wouldn't have to go digging through
documentation and code that isn't forthcoming with a simple and straight
answer...

 WvV> FMAIL.DUP     Contains the database with signatures of
 WvV>               messages used by FMail to detect duplicate
 WvV>               messages. FMail keeps track of the last 16384
 WvV>               messages.

wow... understandable to a point... it brings up the question of what the
records consist of to fill 64K with only 16384 posts...

 WvV> FMAIL32.DUP   The 32-bit version of the duplicate detection
 WvV>               file. It is capable of keeping track of more
 WvV>               duplicates than the 16-bit DOS version. (max.
 WvV>               9999*1024).

this seems inconsistent with the previous statement... one says "16384
messages" whereas this one seems to say 9999 messages with 1024 bytes (bits?)
per entry... see my above statement about digging through documentation with
no simple and straight forward answers...

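The numbers quoted so far can be checked with a little arithmetic, and the "too old" false negatives can be reproduced with a toy model. The sketch below is only an illustration, not FMail's code. It assumes that each record is a single 32-bit CRC (4 bytes) and that the oldest entry is dropped purely by entry count once the base is full. `DupeBase`, `RECORD_BYTES` and `check_and_add` are hypothetical names.

```python
import zlib
from collections import deque

RECORD_BYTES = 4  # assumption: each record is one bare 32-bit CRC

# Under that assumption, 16384 records fill exactly 64 KiB.
DOS_FILE_BYTES = 16384 * RECORD_BYTES

# One reading of "max. 9999*1024": the setting counts records in units of 1024.
MAX_32BIT_RECORDS = 9999 * 1024


class DupeBase:
    """Hypothetical fixed-size signature store with first-in, first-out eviction."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.order = deque()  # signatures in arrival order, oldest first
        self.seen = set()     # the same signatures, for fast lookup

    def check_and_add(self, signature_source: bytes) -> bool:
        """Return True if the message looks like a duplicate, else remember it."""
        sig = zlib.crc32(signature_source)
        if sig in self.seen:
            return True
        if len(self.order) == self.capacity:
            # Base is full: forget the oldest signature (assumed policy).
            self.seen.discard(self.order.popleft())
        self.order.append(sig)
        self.seen.add(sig)
        return False


if __name__ == "__main__":
    base = DupeBase(capacity=3)
    for text in (b"m1", b"m1", b"m2", b"m3", b"m4", b"m1"):
        print(text, base.check_and_add(text))
```

In this model "too old" depends only on how many other messages have arrived since, not on the creation date or the arrival time of the post. Whether FMail behaves the same way is not stated in the quoted documentation.
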
 WvV> Ignore MSGID

 WvV> Normally FMail uses the MSGID of a message (if present)
 WvV> for duplicate detection purposes. In some cases, this
 WvV> may cause problems when different messages are having
 WvV> the same MSGID: one or more of these messages will be
 WvV> marked as duplicates although they are not. If you are
 WvV> frequently experiencing these problems, try setting this
 WvV> switch to 'Yes'.

that would seem to defeat the purpose of MSGID... especially if FMail is
expecting the MSGID to be unique across all message areas... in fact, this
brings up one of the flaws in the MSGID portion of the relevant FTSC standard
document... there is no specification of uniqueness across all message areas
or if the uniqueness is per message area... there are several well known
packages that operate on the "per area" basis which then causes false
positives in other packages... for that matter, there are some well known
packages that maintain duplicate databases on a per area basis instead of one
attempting to cover all message bases...

 WvV> Dups recs (x1024) (32-bit mode only, start FSetupX with "/32")

is this true for all supported OSes? this shouldn't, IMHO, be necessary... the
tool should be able to detect which environment it is running in and use the
necessary means/methods/capabilities...

 WvV> Number of signatures of messages that are stored on
 WvV> disk.

i'm not understanding this since it was separated from the above and looking
like it was just floating...

 WvV> So it depends on the version you are using and your settings. In
 WvV> the .DUP file a crc32 of some parts of the message (depending on
 WvV> your settings) is stored. If you want to know more about the
 WvV> technical details of that, look in the source:

thanks...
but i asked so that

1. non-coders would have a simple straight forward answer

2. myself and others would not have to try to wade through alien code

3. everyone would benefit from an easy concise statement

[eg]
  FMail takes a CRC16 and CRC32 of the binary message header plus
  the first 60 bytes of the message body AS WELL AS a CRC16 and
  CRC32 of the whole message body after the binary header AS WELL
  AS a CRC16 and a CRC32 of the last 60 bytes of the message plus
  all the SEENBY and PATH lines...

  for 16bit systems, we store 16384 records of the above meaning
  that only 16384 messages can be dupe checked... systems with more
  messages may see duplicates.

  for 32bit systems, we store 32768 records of the above meaning
  that only 32768 messages can be dupe checked... systems with more
  messages may see duplicates

  NOTE: FMail uses one duplicate database for ALL message areas. this
  means that some messages will be detected as duplicates even if
  they are in another message area. this can happen due to the method
  used by some software when they post carbon copies or forwarded
  copies of messages.
[/eg]

one catch to the above is when the CRCs are calculated... if they are
calculated after the AREA line has been removed during the toss into the local
base, that eliminates a valuable piece of data that can prevent false
positives... especially those across message areas...

then there's the question of does the duplicate detection have any effect on
the messages being passed on to other systems... depending on how things are
done in the process flow, it may be desirable to pass all messages on to all
other systems and let them detect what they believe are duplicates...
especially if they have a larger duplicate database capability and only 16384
messages are handled across all areas...

with all of that said, i originally asked and hoped to get a simple and easy
to understand response so that none of the above would need to be written and
no one other than the developer would have to go digging into the code to try
to figure out what is really going on...

)\/(ark

One of the great tragedies of life is the murder of a beautiful theory by a
gang of brutal facts. --Benjamin Franklin

--- FMail/Win32 1.60
 * Origin: (1:3634/12.71)

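The cross-area concern raised in the message can be illustrated with a small sketch. It is not FMail's implementation. It assumes the signature is a CRC32 of the MSGID alone and compares a single global dupe base with one keyed by area. The function `is_dupe`, its parameters, the MSGID value and the second area name are all made up for the example.

```python
import zlib


def is_dupe(seen: set, msgid: str, area: str, per_area: bool) -> bool:
    """Return True if (area, signature) was seen before, else remember it.

    per_area=False models one dupe base shared by all areas.
    per_area=True models a separate dupe base for each area.
    """
    sig = zlib.crc32(msgid.encode("utf-8"))  # assumption: signature is CRC32 of the MSGID only
    key = (area if per_area else None, sig)
    if key in seen:
        return True
    seen.add(key)
    return False


if __name__ == "__main__":
    msgid = "1:3634/12.71 0000abcd"  # hypothetical MSGID, reused by a carbon copy

    shared = set()
    print(is_dupe(shared, msgid, "FMAIL_HELP", per_area=False))
    print(is_dupe(shared, msgid, "OTHER_AREA", per_area=False))

    split = set()
    print(is_dupe(split, msgid, "FMAIL_HELP", per_area=True))
    print(is_dupe(split, msgid, "OTHER_AREA", per_area=True))
```

With the shared base the copy in the second area is flagged as a duplicate. With the per-area key it is not. Including the AREA tag in the data that is hashed would have a similar effect, which is why the point at which the CRC is calculated matters.
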
(c) 1994, bbs@darkrealms.ca