I have a collection of email that has followed me around since the mid-2000s, moving between web hosts, Gmail, Outlook.com and iCloud as they added useful features and as I moved from Android to iOS. I've come across two problems while moving mail around over the years:
- After importing into Gmail from other IMAP accounts, everything looks great in Gmail, but on moving the mail to another host the received date changes to the date of the Gmail import.
- On a few occasions, after moving mail around using different desktop email clients, MIME multipart messages have broken. I end up seeing the raw source rather than the HTML mail, attachments etc.
When you move mail from another IMAP account into Gmail using Gmail's online account import tool, everything looks great. Unfortunately, when you later copy that mail somewhere else (e.g. iCloud), the received dates may show up incorrectly.
- Download the email using Google Takeout, which gives you an mbox file containing all mail. Unfortunately this loses any folder structure, but never mind.
- Examine the mbox file. The messages imported by Gmail will have additional headers inserted at the top of each mail, e.g.
    From 1245753982836402098@xxx Sat Aug 25 12:06:17 +0000 2007
    Delivered-To: firstname.lastname@example.org
    Received: by 10.107.187.193 with SMTP id l184csp149097iof; Mon, 21 Sep 2015 16:49:31 -0700 (PDT)
    X-Received: by 10.107.170.32 with SMTP id t32mr30219550ioe.173.1442879371734; Mon, 21 Sep 2015 16:49:31 -0700 (PDT)
    Received: from 303668833448.apps.googleusercontent.com named unknown by gmailapi.google.com with HTTPREST; Mon, 21 Sep 2015 19:49:31 -0400
    Received: from web38814.mail.mud.yahoo.com (184.108.40.206) by spam2.34sp.com with SMTP; 25 Aug 2007 13:13:01 +0100
The original Received header (the last one above) shows the message was received on 25 Aug 2007. Unfortunately, clients other than Gmail will display the 21 Sep 2015 date from the Received headers added by Gmail.
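To get a feel for how many messages are affected before editing anything, the date comparison described above can be sketched in Python. This is only a rough heuristic, not a definitive check: the function names and the 30-day threshold are my own assumptions, and it relies on the date in a Received header being the text after the last semicolon.

```python
import email
from email.utils import parsedate_tz, mktime_tz


def received_timestamps(msg):
    """Return UTC timestamps parsed from each Received header, top to bottom."""
    stamps = []
    for value in msg.get_all("Received", []):
        # The date in a Received header is the text after the last semicolon.
        parsed = parsedate_tz(value.rsplit(";", 1)[-1].strip())
        if parsed is not None:
            stamps.append(mktime_tz(parsed))
    return stamps


def looks_reimported(msg, min_gap_days=30):
    """Guess whether the top Received header is much newer than the oldest one.

    min_gap_days is an arbitrary threshold chosen for illustration.
    """
    stamps = received_timestamps(msg)
    if len(stamps) < 2:
        return False
    return stamps[0] - min(stamps) > min_gap_days * 86400
```

Messages read with the standard library's `mailbox.mbox(path)` are ordinary `email` message objects, so looping over the Takeout file and calling `looks_reimported(message)` on each one should give a rough count of the imported mail.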
To fix this, remove the Gmail headers from each message in the mbox file. This can be accomplished with some creative regex, e.g. in Sublime Text. The headers vary a little between the imports I've seen. The fixed mbox file can then be imported into a mail client.
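As an alternative to regex in an editor, the same removal could be scripted. The following is a minimal sketch rather than a tested tool: the patterns are assumptions taken from the single sample above (including treating Delivered-To as Gmail-added), the function names are my own, and it assumes one raw message as text with LF line endings.

```python
import re

# Headers the Gmail import added in the sample above. These patterns are
# assumptions based on that one sample; adjust them for your own imports.
GMAIL_ADDED = [
    re.compile(r"Delivered-To:", re.I),
    re.compile(r"X-Received:", re.I),
    re.compile(r"Received: by 10\.\S+ with SMTP id", re.I),
    re.compile(
        r"Received: from \S+\.apps\.googleusercontent\.com\b.*gmailapi\.google\.com",
        re.I | re.S,
    ),
]


def split_headers(header_text):
    """Group header lines so folded continuation lines stay with their header."""
    headers = []
    for line in header_text.split("\n"):
        if line[:1] in (" ", "\t") and headers:
            headers[-1] += "\n" + line
        else:
            headers.append(line)
    return headers


def strip_gmail_headers(raw_message):
    """Drop the run of Gmail-added headers at the top of one raw message."""
    header_text, sep, body = raw_message.partition("\n\n")
    headers = split_headers(header_text)
    kept = []
    # Keep the mbox "From " separator line if it is present.
    if headers and headers[0].startswith("From "):
        kept.append(headers.pop(0))
    # Only drop matching headers while they form a contiguous run at the top,
    # so similar headers further down the original message are left alone.
    i = 0
    while i < len(headers) and any(p.match(headers[i]) for p in GMAIL_ADDED):
        i += 1
    kept.extend(headers[i:])
    return "\n".join(kept) + sep + body
```

Stopping at the first header that does not match is a deliberately cautious choice: the original Received header then becomes the first one in the message, and anything the original servers added stays untouched. Work on a copy of the mbox file in any case.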
After using a number of different clients to move mail around over several years (Thunderbird, Outlook, OSX Mail, Windows Live Mail), I've often seen MIME multipart emails with HTML and attachments become broken. The message is moved or copied but will no longer display properly - I see the raw source of all the MIME parts, cannot view attachments etc.
The problem turns out to be that the clients or servers somehow inserted spurious Content-Type headers above the original Content-Type header of the MIME mail. The added header prevents the message from being read correctly.
    From: xxxxxxx <email@example.com>
    Sender: <firstname.lastname@example.org>
    To: "xxxxxx"
    References: <df696fdc-b801-4be8-bcb0-f3d59169dcba@SwitchService>
    Date: Wed, 11 Sep 2013 16:19:50 -0500
    Message-ID: <06460AE9-3D64-4422-88A2-C05D869A22BC@xxxxxxxxx>
    MIME-Version: 1.0
    Content-Type: text/plain; charset="iso-8859-1"
    Content-Transfer-Encoding: 7bit
    X-Mailer: Microsoft Outlook 15.0
    Content-Language: en-us
    X-Google-Sender-Auth: 1IItZObPtrZmxbo-5dealV_naxQ
    Content-type: multipart/alternative; boundary="B_3521739430_567641463"

    > This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible.

    --B_3521739430_567641463
    Content-type: text/plain; charset="UTF-8"
    Content-transfer-encoding: 7bit
In an email client I see all the source, from the original MIME Content-Type header…
Content-type: multipart/alternative; boundary="B_3521739430_567641463"
… which is being overridden by the header further up, which has crept in at some point:
    MIME-Version: 1.0
    Content-Type: text/plain; charset="iso-8859-1"
    Content-Transfer-Encoding: 7bit
Once this block is removed the message is correctly recognized as MIME multipart, and displays properly. This can also be fixed in an mbox file with some find and replace.
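For anyone who would rather script the find and replace, here is a hedged Python sketch of the idea. It only acts when a message has an earlier non-multipart Content-Type followed by a later multipart one, which is the pattern in the sample above. The function names are my own, it assumes LF line endings, and unlike the manual fix it leaves the MIME-Version line in place, since that header is valid on a multipart message anyway.

```python
import re


def split_headers(header_text):
    """Group header lines so folded continuation lines stay with their header."""
    headers = []
    for line in header_text.split("\n"):
        if line[:1] in (" ", "\t") and headers:
            headers[-1] += "\n" + line
        else:
            headers.append(line)
    return headers


def header_name(header):
    """Lower-cased field name of a header line."""
    return header.split(":", 1)[0].strip().lower()


def is_multipart(header):
    return re.match(r"content-type:\s*multipart/", header, re.I) is not None


def drop_spurious_content_type(raw_message):
    """Remove an earlier non-multipart Content-Type that shadows a multipart one."""
    header_text, sep, body = raw_message.partition("\n\n")
    headers = split_headers(header_text)
    ct = [i for i, h in enumerate(headers) if header_name(h) == "content-type"]
    if len(ct) < 2:
        return raw_message  # nothing duplicated, leave the message alone
    first, last = ct[0], ct[-1]
    if is_multipart(headers[first]) or not is_multipart(headers[last]):
        return raw_message  # not the pattern described above
    drop = {first}
    # In the sample, the spurious Content-Type is immediately followed by a
    # Content-Transfer-Encoding line that was inserted with it.
    nxt = first + 1
    if nxt < len(headers) and header_name(headers[nxt]) == "content-transfer-encoding":
        drop.add(nxt)
    kept = [h for i, h in enumerate(headers) if i not in drop]
    return "\n".join(kept) + sep + body
```

Only the top-level header block is touched. The Content-type lines inside each MIME part sit after the first blank line, so they are passed through unchanged.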