如何将“at”替换为“@”

5

我有大约17k封电子邮件,内容包括订单,新闻,联系方式等,时间跨度为11年。

为了防止网络爬虫和垃圾邮件,用户的电子邮件地址被粗略地加密,将@更改为*@*'at'

我正在尝试创建逗号分隔列表以构建我们用户的数据库。

代码可以通过编写文件和循环文件夹来工作,因为如果我将发件人的电子邮件地址写入我目前正在使用邮件正文的文件中,则可以正常打印。

问题在于, Replace函数没有将*at*等替换为@

  1. 首先,为什么会这样呢?
  2. 是否有更好的方法来完成整个过程?
Private Sub Form_Load()

   Dim objOutlook As New Outlook.Application
   Dim objNameSpace As Outlook.NameSpace
   Dim objInbox As MAPIFolder
   Dim objFolder As MAPIFolder
   Dim fldName As String

   fldName = "TEST"

   ' Get the MAPI reference

   Set objNameSpace = objOutlook.GetNamespace("MAPI")

   ' Pick up the Inbox

   Set objInbox = objNameSpace.GetDefaultFolder(olFolderInbox)

   'Loop through the folders under the Inbox
   For Each objFolder In objInbox.Folders
       RecurseFolders fldName, objFolder
   Next objFolder

End Sub

Public Sub RecurseFolders(targetFolder As String, currentFolder As MAPIFolder)
   If currentFolder.Name = targetFolder Then
       GetEmails currentFolder
   Else
       Dim objFolder As MAPIFolder
       If currentFolder.Folders.Count > 0 Then
           For Each objFolder In currentFolder.Folders
               RecurseFolders targetFolder, objFolder
           Next
       End If
     End If
End Sub

Sub WriteToATextFile(e As String)
    MyFile = "c:\" & "emailist.txt"
    'set and open file for output
    fnum = FreeFile()
    Open MyFile For Append As fnum
    Print #fnum, e; ","
    Close #fnum
End Sub

Sub GetEmails(folder As MAPIFolder)
    Dim objMail As MailItem

    ' Read through all the items
    For i = 1 To folder.Items.Count
        Set objMail = folder.Items(i)
        GetEmail objMail.Body              
    Next i

End Sub

Sub GetEmail(s As String)
    Dim txt = s
    Do Until InStr(txt, "@") <= 0
        Dim tleft As Integer
        Dim tright As Integer
        Dim start As Integer
        Dim text As String
        Dim email As String

        text = Replace(text, " at ", "@", VbCompareMethod.vbTextCompare)
        text = Replace(text, "'at'", "@", VbCompareMethod.vbTextCompare)
        text = Replace(text, "*at*", "@", VbCompareMethod.vbTextCompare)
        text = Replace(text, "*at*", "@", VbCompareMethod.vbTextCompare)

        text = Replace(text, "<", " ", VbCompareMethod.vbTextCompare)
        text = Replace(text, ">", " ", VbCompareMethod.vbTextCompare)
        text = Replace(text, ":", " ", VbCompareMethod.vbTextCompare)

        'one two ab@bd.com one two
        tleft = InStr(text, "@") '11

        WriteToATextFile Str(tleft)
        WriteToATextFile Str(Len(text))

        start = InStrRev(text, " ", Len(text) - tleft)
        'WriteToATextFile Str(start)
        'WriteToATextFile Str(Len(text))
        'start = Len(text) - tleft
        text = left(text, start)
        'ab@bd.com one two

        tright = InStr(text, " ") '9
        email = left(text, tright)
        WriteToATextFile email

        text = right(text, Len(text) - Len(email))
        GetEmail txt
    Loop
End Sub

你尝试过使用 text = Replace(text, "at", "@", VbCompareMethod.vbTextCompare) 吗?即在 at 中不加空格。Replace 函数无法使用通配符。 - Jon49
是的,我需要那个空格,因为空格已经存在,我也需要将其删除。在所有的行上,替换功能都无法正常工作。 - lloydphillips
你能否给出一个实际的文本替换语句的例子?你也可以使用即时窗口进行测试。只需输入“?”然后输入一个测试用例即可。 - Jon49
3
您的“GetEmail”过程声明了一个名为“txt”的变量,并将其初始值设为参数“s”的值。然而,代码没有使用此变量,而是使用了“text”。这是您发帖中的笔误吗?还是实际上就是您的代码?另外,倒数第三行再次调用了该过程……就好像它是一个递归函数一样。 - Rachel Hettinger
2个回答

5
使用正则表达式(Regex)怎么样?类似于以下内容:

像这样:

Public Function ReplaceAT(ByVal sInput as String)
     Dim RegEx As Object
     Set RegEx = CreateObject("vbscript.regexp")
     With RegEx
      .Global = True
      .IgnoreCase = True
      .MultiLine = True
      .Pattern = "( at |'at'|<at>)"
     End With
     ReplaceAT = RegEx.Replace(sInput, "@")
     Set RegEx = Nothing
End Function

只需用可能出现的所有情况替换正则表达式即可。
有关更多提示和信息,请参见http://www.regular-expressions.info/


我建议您在使用正则表达式进行电子邮件验证之前和之后,可以先识别无效的电子邮件。 电子邮件验证 - brettdj

4
我已经尝试提取电子邮件地址,例如下面的示例,将在样本消息中突出显示的三个黄色电子邮件地址提取到CSV文件中。
  1. 任何有效的电子邮件都会被写入CSV文件 Set objTF = objFSO.createtextfile("c:\myemail.csv")
  2. 此代码扫描名为Inbox下的temp文件夹中的所有电子邮件 我删除了您递归测试和简化部分
  3. 有四个字符串操作
  4. 此行将任何非打印空格转换为普通空格 strMsgBody = Replace(strMsgBody, Chr(160), Chr(32)(不太可能,但在我的测试中发生了)
  5. Regex1将任何“ at ”或“at”等转换为“@” "(\s+at\s+|'at'|<at>|\*at\*|at)"
  6. Regex2将任何“ dot ”或“dot”等转换为“.” "(\s+dot\s+|'dot'|<dot>|\*dot\*|dot)"
  7. Regex3将任何“<”,“>”或“:”转换为“” .Pattern = "[<:>]"
  8. Regex4从电子邮件正文中提取任何有效电子邮件
  9. 使用objTF.writeline objRegM将任何有效电子邮件写入CSV文件

    enter image description here

以下是代码:

Public Test()
Dim objOutlook As New Outlook.Application
Dim objNameSpace As Outlook.NameSpace
Dim objFolder As MAPIFolder
Dim strfld As String
Dim objRegex As Object
Dim objRegMC As Object
Dim objRegM As Object
Dim objFSO As Object
Dim oMailItem As MailItem
Dim objTF As Object
Dim strMsgBody As String    
Set objRegex = CreateObject("vbscript.regexp")
Set objFSO = CreateObject("scripting.filesystemobject")
Set objTF = objFSO.createtextfile("c:\myemail.csv")

With objRegex
    .Global = True
    .MultiLine = True
    .ignorecase = True
    strfld = "temp"
    'Get the MAPI reference
    Set objNameSpace = objOutlook.GetNamespace("MAPI")
    'Pick up the Inbox
    Set objFolder = objNameSpace.GetDefaultFolder(olFolderInbox)
    Set objFolder = objFolder.Folders(strfld)
    For Each oMailItem In objFolder.Items
        strMsgBody = oMailItem.Body
        strMsgBody = Replace(strMsgBody, Chr(160), Chr(32))
        .Pattern = "(\s+at\s+|'at'|<at>|\*at\*|at)"
        strMsgBody = .Replace(strMsgBody, "@")
        .Pattern = "(\s+dot\s+|'dot'|<dot>|\*dot\*|dot)"
        strMsgBody = .Replace(strMsgBody, ".")
        .Pattern = "[<:>]"
        strMsgBody = .Replace(strMsgBody, vbNullString)
        .Pattern = "[\w-\.]{1,}\@([\da-zA-Z-]{1,}\.){1,}[\da-zA-Z-]{2,3}"
        If .Test(strMsgBody) Then
            Set objRegMC = .Execute(strMsgBody)
            For Each objRegM In objRegMC
                objTF.writeline objRegM
            Next
        End If
    Next
End With
objTF.Close
End Sub

1
好棒的表现!至少值得一个+1 :) - JMax

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接